BREAKING NEWS
LATEST POSTS
-
Marigold – repurposing diffusion-based image generators for dense predictions
Marigold repurposes Stable Diffusion for dense prediction tasks such as monocular depth estimation and surface normal prediction, delivering a level of detail often missing even in top discriminative models.
Key aspects that make it great:
– Reuses the original VAE and only lightly fine-tunes the denoising UNet
– Trained on just tens of thousands of synthetic image–modality pairs
– Runs on a single consumer GPU (e.g., RTX 4090)
– Zero-shot generalization to real-world, in-the-wild images
https://mlhonk.substack.com/p/31-marigold
https://arxiv.org/pdf/2505.09358
https://marigoldmonodepth.github.io/
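A minimal way to try it, sketched under the assumption that the Hugging Face diffusers integration (MarigoldDepthPipeline, diffusers >= 0.28) and the prs-eth/marigold-depth-lcm-v1-0 checkpoint are what you want; check the project page for the currently recommended entry point:
import torch
import diffusers

# Load the LCM depth checkpoint (assumed name) onto a single consumer GPU.
pipe = diffusers.MarigoldDepthPipeline.from_pretrained(
    "prs-eth/marigold-depth-lcm-v1-0", variant="fp16", torch_dtype=torch.float16
).to("cuda")

image = diffusers.utils.load_image("input.jpg")  # any in-the-wild photo
depth = pipe(image)                              # affine-invariant depth prediction

# Colorize and save the prediction for a quick visual check.
vis = pipe.image_processor.visualize_depth(depth.prediction)
vis[0].save("input_depth.png")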
-
Runway Aleph
https://runwayml.com/research/introducing-runway-aleph
Generate New Camera Angles
Generate the Next Shot
Use Any Style to Transfer to a Video
Change Environments, Locations, Seasons and Time of Day
Add Things to a Scene
Remove Things from a Scene
Change Objects in a Scene
Apply the Motion of a Video to an Image
Alter a Character’s Appearance
Recolor Elements of a Scene
Relight Shots
Green Screen Any Object, Person or Situation
-
Mike Wong – AtoMeow – A Blue noise image stippling in Processing
https://github.com/mwkm/atoMeow
https://www.shadertoy.com/view/7s3XzX
This demo is created for coders who are familiar with this awesome creative coding platform. You may quickly modify the code to work for video, or stipple your own Processing drawings by turning them into a PImage and running the simulation. This demo code also serves as a reference implementation of my article "Blue noise sampling using an N-body simulation-based method". If you are interested in 2.5D, you may mod the code to achieve what I discussed in this artist-friendly article. Convert your video to dotted noise.
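For a rough idea of the effect (this is not the atoMeow/N-body method, just a hypothetical dart-throwing sketch in Python with numpy): darker pixels permit tighter dot spacing, so dots cluster in shadows while staying clump-free, roughly as blue-noise stippling does:
import numpy as np

def stipple(gray, n_darts=5000, r_min=2.0, r_max=12.0, seed=0):
    """gray: 2-D float array in [0, 1], 0 = black. Returns accepted (x, y) dots."""
    rng = np.random.default_rng(seed)
    h, w = gray.shape
    dots = []
    for _ in range(n_darts):
        x, y = rng.uniform(0, w), rng.uniform(0, h)
        # Local spacing: dark pixel -> r_min, bright pixel -> r_max.
        r = r_min + gray[int(y), int(x)] * (r_max - r_min)
        # Keep the dart only if no existing dot sits closer than r.
        if all((x - px) ** 2 + (y - py) ** 2 >= r * r for px, py in dots):
            dots.append((x, y))
    return dots

# Toy input: a horizontal gradient ends up denser on its dark side.
gradient = np.tile(np.linspace(0.0, 1.0, 200), (100, 1))
print(len(stipple(gradient)), "dots placed")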
-
Aitor Echeveste – Free CG and Comp Projection Shot, Download the Assets & Follow the Workflow
What’s Included:
- Cleaned and extended base plates
- Full Maya and Nuke 3D projection layouts
- Bullet and environment CG renders with AOVs (RGB, normals, position, ID, etc.)
- Explosion FX in slow motion
- 3D scene geometry for projection
- Camera + lensing setup
- Light groups and passes for look development
-
Tauseef Fayyaz – About readable code – Clean Code Practices
Here’s what to master in Clean Code Practices:
🔹 Code Readability & Simplicity – Use meaningful names, write short functions, follow SRP, flatten logic, and remove dead code.
→ Clarity is a feature.
🔹 Function & Class Design – Limit parameters, favor pure functions, small classes, and composition over inheritance.
→ Structure drives scalability.
🔹 Testing & Maintainability – Write readable unit tests, avoid over-mocking, test edge cases, and refactor with confidence.
→ Test what matters.
🔹 Code Structure & Architecture – Organize by features, minimize global state, avoid god objects, and abstract smartly.
→ Architecture isn’t just backend.
🔹 Refactoring & Iteration – Apply the Boy Scout Rule, DRY, KISS, and YAGNI principles regularly.
→ Refactor like it’s part of development.
🔹 Robustness & Safety – Validate early, handle errors gracefully, avoid magic numbers, and favor immutability.
→ Safe code is future-proof.
🔹 Documentation & Comments – Let your code explain itself. Comment why, not what, and document at the source.
→ Good docs reduce team friction.
🔹 Tooling & Automation – Use linters, formatters, static analysis, and CI reviews to automate code quality.
→ Let tools guard your gates.
🔹 Final Review Practices – Review, refactor nearby code, and avoid cleverness in the name of brevity.
→ Readable code is better than smart code. (A tiny illustrative sketch of a few of these practices follows.)
-
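A hypothetical Python fragment (names invented here, not from the post) packing in a few of the habits above: a named constant instead of a magic number, an early guard clause, and a short single-purpose function:
MAX_RETRIES = 3  # magic number given a descriptive name

def fetch_user_name(user_id: str, client) -> str:
    """Return the display name for user_id, retrying a bounded number of times."""
    if not user_id:
        raise ValueError("user_id must be non-empty")  # validate early, fail loudly
    for _ in range(MAX_RETRIES):
        response = client.get_user(user_id)  # 'client' is a stand-in for any API wrapper
        if response is not None:
            return response["name"]
    raise RuntimeError(f"user {user_id!r} unavailable after {MAX_RETRIES} attempts")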
Mark Theriault – “Steamboat Willie” – AI Re-Imagining of a 1928 Classic in 4K
I ran Steamboat Willie (now public domain) through Flux Kontext to reimagine it as a 3D-style animated piece. Instead of going the polished route with something like Wan 2.1 for full image-to-video generation, I leaned into the raw, handmade vibe that comes from converting each frame individually. It gave it a kind of stop-motion texture: imperfect, a bit wobbly, but full of character.
-
Microsoft DAViD – Data-efficient and Accurate Vision Models from Synthetic Data
Our human-centric dense prediction model delivers high-quality, detailed (depth) results while achieving remarkable efficiency, running orders of magnitude faster than competing methods, with inference speeds as low as 21 milliseconds per frame (the large multi-task model on an NVIDIA A100). It reliably captures a wide range of human characteristics under diverse lighting conditions, preserving fine-grained details such as hair strands and subtle facial features. This demonstrates the model’s robustness and accuracy in complex, real-world scenarios.
https://microsoft.github.io/DAViD
The state of the art in human-centric computer vision achieves high accuracy and robustness across a diverse range of tasks. The most effective models in this domain have billions of parameters, thus requiring extremely large datasets, expensive training regimes, and compute-intensive inference. In this paper, we demonstrate that it is possible to train models on much smaller but high-fidelity synthetic datasets, with no loss in accuracy and higher efficiency. Using synthetic training data provides us with excellent levels of detail and perfect labels, while providing strong guarantees for data provenance, usage rights, and user consent. Procedural data synthesis also provides us with explicit control on data diversity, that we can use to address unfairness in the models we train. Extensive quantitative assessment on real input images demonstrates accuracy of our models on three dense prediction tasks: depth estimation, surface normal estimation, and soft foreground segmentation. Our models require only a fraction of the cost of training and inference when compared with foundational models of similar accuracy.
-
Embedding frame ranges into Quicktime movies with FFmpeg
QuickTime (.mov) files are fundamentally time-based, not frame-based, and so don’t have a built-in, uniform “first frame/last frame” field you can set as numeric frame IDs. Instead, tools like Shotgun Create rely on the timecode track and the movie’s duration to infer frame numbers. If you want Shotgun to pick up a non-default frame range (e.g. start at 1001, end at 1064), you must bake in an SMPTE timecode that corresponds to your desired start frame, and ensure the movie’s duration matches your clip length.
How Shotgun Reads Frame Ranges
- Default start frame is 1. If no timecode metadata is present, Shotgun assumes the movie begins at frame 1.
- Timecode ⇒ frame number. Shotgun Create “honors the timecodes of media sources,” mapping the embedded TC to frame IDs. For example, a 24 fps QuickTime tagged with a start timecode of 00:00:41:17 will be interpreted as beginning on frame 1001 (1001 ÷ 24 fps ≈ 41.71 s).
Embedding a Start Timecode
QuickTime uses a tmcd (timecode) track. You can bake in an SMPTE track via FFmpeg's -timecode flag or via Compressor/encoder settings:
- Compute your start TC:
- Desired start frame = 1001
- Frame 1001 at 24 fps ⇒ 1001 ÷ 24 ≈ 41.708 s ⇒ TC 00:00:41:17
- FFmpeg example:
ffmpeg -i input.mov -c copy -timecode 00:00:41:17 output.mov
This adds a timecode track beginning at 00:00:41:17, which Shotgun maps to frame 1001.
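The frame-to-timecode arithmetic is easy to script as a sanity check; a minimal Python sketch, assuming an integer, non-drop-frame rate:
def start_timecode(start_frame: int, fps: int) -> str:
    """Start frame -> SMPTE timecode, mirroring 1001 @ 24 fps -> 00:00:41:17."""
    total_seconds, frames = divmod(start_frame, fps)
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}:{frames:02d}"

print(start_timecode(1001, 24))  # 00:00:41:17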
Ensuring the Correct End Frame
Shotgun infers the last frame from the movie’s duration. To end on frame 1064:
- Frame count = 1064 – 1001 + 1 = 64 frames
- Duration = 64 ÷ 24 fps ≈ 2.667 s
FFmpeg trim example:
ffmpeg -i input.mov -c copy -timecode 00:00:41:17 -t 00:00:02.667 output_trimmed.mov
This results in a 64-frame clip (1001→1064) at 24 fps.
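Putting both steps together, a small wrapper can derive the duration from the frame range and invoke the same command; a sketch only, assuming ffmpeg is on the PATH and reusing start_timecode() from the snippet above:
import subprocess

def bake_frame_range(src: str, dst: str, first: int, last: int, fps: int) -> None:
    n_frames = last - first + 1      # 1064 - 1001 + 1 = 64
    duration = n_frames / fps        # 64 / 24 ≈ 2.667 s
    subprocess.run([
        "ffmpeg", "-i", src,
        "-c", "copy",
        "-timecode", start_timecode(first, fps),  # defined in the sketch above
        "-t", f"{duration:.3f}",
        dst,
    ], check=True)

bake_frame_range("input.mov", "output_trimmed.mov", 1001, 1064, 24)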
FEATURED POSTS
-
How Good Is ChatGPT at Coding, Really?
https://spectrum.ieee.org/chatgpt-for-coding
ChatGPT has not yet been exposed to new problems and their solutions. It lacks the critical thinking skills of a human and can only address problems it has previously encountered. This could explain why it is so much better at addressing older coding problems than newer ones.
“ChatGPT may generate incorrect code because it does not understand the meaning of algorithm problems, thus, simple error feedback information is not enough,”
-
AnimationXpress.com interviews Daniele Tosti for TheCgCareer.com channel
You’ve been in the VFX Industry for over a decade. Tell us about your journey.
It all started with my older brother giving me a Commodore 64 personal computer as a gift back in the late '80s. I realised then that I could create something directly from my imagination using this new digital medium and, eventually, make a living in the process.
That led me to start my professional career in 1990, going from live TV to games to animation, all the way to live-action VFX in recent years. I have never stopped craving to create art since those early days, and I have been incredibly fortunate to work with really great talent along the way, which made my journey so much more effective.
What inspired you to pursue VFX as a career?
An incredible combination of opportunities, really. The opportunity to express myself as an artist and earn money in the process. The opportunity to learn about how the world around us works and how best to solve problems. The opportunity to share my time with other talented people with similar passions. The opportunity to grow and adapt to new challenges. The opportunity to develop something that was never done before. A perfect storm of creativity that fed my continuous curiosity about life and genuinely drove my inspiration.
Tell us about the projects you’ve particularly enjoyed working on in your career
(more…)
-
AI Data Laundering: How Academic and Nonprofit Researchers Shield Tech Companies from Accountability
“Simon Willison created a Datasette browser to explore WebVid-10M, one of the two datasets used to train the video generation model, and quickly learned that all 10.7 million video clips were scraped from Shutterstock, watermarks and all.”
“In addition to the Shutterstock clips, Meta also used 10 million video clips from this 100M video dataset from Microsoft Research Asia. It’s not mentioned on their GitHub, but if you dig into the paper, you learn that every clip came from over 3 million YouTube videos.”
“It’s become standard practice for technology companies working with AI to commercially use datasets and models collected and trained by non-commercial research entities like universities or non-profits.”
“Like with the artists, photographers, and other creators found in the 2.3 billion images that trained Stable Diffusion, I can’t help but wonder how the creators of those 3 million YouTube videos feel about Meta using their work to train their new model.”
-
9 Best Hacks to Make a Cinematic Video with Any Camera
https://www.flexclip.com/learn/cinematic-video.html
- Frame Your Shots to Create Depth
- Create Shallow Depth of Field
- Avoid Shaky Footage and Use Flexible Camera Movements
- Properly Use Slow Motion
- Use Cinematic Lighting Techniques
- Apply Color Grading
- Use Cinematic Music and SFX
- Add Cinematic Fonts and Text Effects
- Create the Cinematic Bar at the Top and the Bottom