LATEST POSTS
-
OpenCV Python for Computer Vision
-
DepthCrafter – Generating Consistent Long Depth Sequences for Open-world Videos
https://depthcrafter.github.io/
We innovate DepthCrafter, a novel video depth estimation approach, by leveraging video diffusion models. It can generate temporally consistent long depth sequences with fine-grained details for open-world videos, without requiring additional information such as camera poses or optical flow.
-
ByteDance Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency
https://loopyavatar.github.io/
Loopy supports various visual and audio styles. It can generate vivid motion details from audio alone, such as non-speech movements like sighing, emotion-driven eyebrow and eye movements, and natural head movements.
-
Ross Pettit on The Agile Manager – How tech firms came to prioritize cash flow over talent (and artists)
For years, tech firms were fighting a war for talent. Now they are waging war on talent.
This shift has led to a weakening of the social contract between employees and employers, with culture and employee values being sidelined in favor of financial discipline and free cash flow.
The operating environment has changed from a high tolerance for failure (where cheap capital and willing spenders accepted slipped dates and feature lag) to a very low – if not zero – tolerance for failure (fiscal discipline is in vogue again).
While preventing and containing mistakes staves off shocks to the income statement, it doesn’t fundamentally reduce costs. Years of payroll bloat – aggressive hiring, aggressive comp packages to attract and retain people – make labor the biggest cost in tech.
…Of course, companies can reduce their labor force through natural attrition. Other labor policy changes – return to office mandates, contraction of fringe benefits, reduction of job promotions, suspension of bonuses and comp freezes – encourage more people to exit voluntarily. It’s cheaper to let somebody self-select out than it is to lay them off.
…Employees recruited in more recent years from outside the ranks of tech were given the expectation that we’ll teach you what you need to know, we want you to join because we value what you bring to the table. That is no longer applicable. Runway for individual growth is very short in zero-tolerance-for-failure operating conditions. Job preservation, at least in the short term for this cohort, comes from completing corporate training and acquiring professional certifications. Training through community or experience is not in the cards.
…The ability to perform competently in multiple roles, the extra-curriculars, the self-directed enrichment, the ex-company leadership – all these things make no matter. The calculus is what you got paid versus how you performed on objective criteria relative to your cohort. Nothing more.
…Here is where the change in the social contract is perhaps the most blatant. In the “destination employer” years, the employee invested in the community and its values, and the employer rewarded the loyalty of its employees through things like runway for growth (stretch roles and sponsored work innovation) and tolerance for error (valuing demonstrable learning over perfection in execution). No longer.
http://www.rosspettit.com/2024/08/for-years-tech-was-fighting-war-for.html
-
2DGS – 2D Gaussian Splatting for Geometrically Accurate Radiance Fields
A 2D Gaussian Splats technique for extracting cleaner 3D geometry from 3DGS
https://github.com/hbb1/2d-gaussian-splatting
https://surfsplatting.github.io/
https://colab.research.google.com/drive/1qoclD7HJ3-o0O1R8cvV3PxLhoDCMsH8W
3D Gaussian Splatting (3DGS) has recently revolutionized radiance field reconstruction, achieving high quality novel view synthesis and fast rendering speed without baking. However, 3DGS fails to accurately represent surfaces due to the multi-view inconsistent nature of 3D Gaussians. We present 2D Gaussian Splatting (2DGS), a novel approach to model and reconstruct geometrically accurate radiance fields from multi-view images. Our key idea is to collapse the 3D volume into a set of 2D oriented planar Gaussian disks. Unlike 3D Gaussians, 2D Gaussians provide view-consistent geometry while modeling surfaces intrinsically. To accurately recover thin surfaces and achieve stable optimization, we introduce a perspective-accurate 2D splatting process utilizing ray-splat intersection and rasterization. Additionally, we incorporate depth distortion and normal consistency terms to further enhance the quality of the reconstructions. We demonstrate that our differentiable renderer allows for noise-free and detailed geometry reconstruction while maintaining competitive appearance quality, fast training speed, and real-time rendering.
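As a rough illustration of the "oriented planar Gaussian disk" idea in the abstract, here is a minimal sketch in plain Python. It is not the 2DGS renderer: it uses an explicit ray-plane intersection in place of the paper's rasterization pipeline, and the function name `splat_weight` and its parameters are assumptions made for this example. It assumes the two tangent vectors are orthonormal.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def splat_weight(ray_o, ray_d, center, t_u, t_v, s_u, s_v):
    """Gaussian weight at the point where a ray hits a planar disk.

    Hypothetical helper: the disk is given by its center, two orthonormal
    tangent vectors (t_u, t_v) and a scale along each tangent (s_u, s_v).
    Returns 0.0 if the ray is parallel to the plane or the hit is behind
    the ray origin.
    """
    normal = cross(t_u, t_v)
    denom = dot(normal, ray_d)
    if abs(denom) < 1e-9:
        return 0.0  # ray is parallel to the disk's plane
    offset = tuple(c - o for c, o in zip(center, ray_o))
    t = dot(normal, offset) / denom
    if t <= 0:
        return 0.0  # intersection lies behind the ray origin
    hit = tuple(o + t * d for o, d in zip(ray_o, ray_d))
    rel = tuple(h - c for h, c in zip(hit, center))
    # Coordinates of the hit point in the disk's local (u, v) frame.
    u = dot(rel, t_u) / s_u
    v = dot(rel, t_v) / s_v
    return math.exp(-0.5 * (u * u + v * v))
```

Because the weight is evaluated at the intersection point on the disk itself, two rays from different cameras that hit the same surface point get the same weight, which is one way to read the abstract's claim that 2D Gaussians are view-consistent.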
-
Kiosk – Library Tool for 3D Artists
Kiosk streamlines resource management with tailored filtering, customizable organization, and seamless integration into Maya, Houdini, Blender and Cinema4D. Maintain one library for them all!
https://fabianstrube.gumroad.com/l/kiosk-library
FEATURED POSTS
-
Hunyuan video-to-video re-styling
The open-source community has figured out how to run Hunyuan V2V using LoRAs.
You’ll need to install logtd’s ComfyUI-HunyuanLoom and LoRAs, which you can either train yourself or find on Civitai.
1) Install HunyuanLoom; the workflow is included in the repo.
https://github.com/logtd/ComfyUI-HunyuanLoom
2) A John Wick LoRA can be found here.
https://civitai.com/models/1131159/john-wick-hunyuan-video-lora
-
Guide to Prompt Engineering
The 10 most powerful techniques:
1. Communicate the Why
2. Explain the context (strategy, data)
3. Clearly state your objectives
4. Specify the key results (desired outcomes)
5. Provide an example or template
6. Define roles and use the thinking hats
7. Set constraints and limitations
8. Provide step-by-step instructions (CoT)
9. Ask to reverse-engineer the result to get a prompt
10. Use markdown or XML to clearly separate sections (e.g., examples)
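Technique 10 can be sketched with a small helper that wraps each part of a prompt in its own XML-style tag. The function name `build_prompt` and the tag names used below are illustrative assumptions, not a required schema; any consistent set of tags would serve.

```python
def build_prompt(sections):
    """Join named sections into one prompt, each wrapped in <tag>...</tag>.

    `sections` maps a tag name (hypothetical, chosen by the caller) to the
    text that belongs in that section. Order follows the dict's order.
    """
    parts = []
    for tag, text in sections.items():
        parts.append(f"<{tag}>\n{text.strip()}\n</{tag}>")
    return "\n\n".join(parts)


if __name__ == "__main__":
    prompt = build_prompt({
        "context": "We are planning the next quarter for a mobile banking app.",
        "objective": "Propose three experiments to reduce onboarding drop-off.",
        "constraints": "Each experiment must be runnable within two weeks.",
        "example": "Experiment: shorten the signup form from 8 fields to 4.",
    })
    print(prompt)
```

Keeping the context, objective, constraints, and example in separate tags makes it harder for the model to confuse an example with an instruction.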
Top 10 high-ROI use cases for PMs:
1. Get new product ideas
2. Identify hidden assumptions
3. Plan the right experiments
4. Summarize a customer interview
5. Summarize a meeting
6. Social listening (sentiment analysis)
7. Write user stories
8. Generate SQL queries for data analysis
9. Get help with PRD and other templates
10. Analyze your competitors
Quick prompting scheme:
1- pass an image to JoyCaption
https://www.pixelsham.com/2024/12/23/joy-caption-alpha-two-free-automatic-caption-of-images/
2- tune the caption with ChatGPT as suggested by Pixaroma:
Craft detailed prompts for AI (image/video) generation, avoiding quotation marks. When I provide a description or image, translate it into a prompt that captures a cinematic, movie-like quality, focusing on elements like scene, style, mood, lighting, and specific visual details. Ensure that the prompt evokes a rich, immersive atmosphere, emphasizing textures, depth, and realism. Always incorporate (static/slow) camera or cinematic movement to enhance the feeling of fluidity and visual storytelling. Keep the wording precise yet descriptive, directly usable, and designed to achieve a high-quality, film-inspired result.
https://www.reddit.com/r/ChatGPT/comments/139mxi3/chatgpt_created_this_guide_to_prompt_engineering/
1. Use the 80/20 principle to learn faster
Prompt: “I want to learn about [insert topic]. Identify and share the most important 20% of learnings from this topic that will help me understand 80% of it.”
2. Learn and develop any new skill
Prompt: “I want to learn/get better at [insert desired skill]. I am a complete beginner. Create a 30-day learning plan that will help a beginner like me learn and improve this skill.”
3. Summarize long documents and articles
Prompt: “Summarize the text below and give me a list of bullet points with key insights and the most important facts.” [Insert text]
4. Train ChatGPT to generate prompts for you
Prompt: “You are an AI designed to help [insert profession]. Generate a list of the 10 best prompts for yourself. The prompts should be about [insert topic].”
5. Master any new skill
Prompt: “I have 3 free days a week and 2 months. Design a crash study plan to master [insert desired skill].”
6. Simplify complex information
Prompt: “Break down [insert topic] into smaller, easier-to-understand parts. Use analogies and real-life examples to simplify the concept and make it more relatable.”
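The bracketed placeholders in the prompts above lend themselves to simple templating. The following is a minimal sketch, assuming placeholders always take the form `[insert ...]`; the helper name `fill_prompt` is made up for this example.

```python
import re

# Matches placeholders such as "[insert topic]" or "[Insert text]".
PLACEHOLDER = re.compile(r"\[insert ([^\]]+)\]", re.IGNORECASE)


def fill_prompt(template, values):
    """Replace every [insert X] placeholder with values["X"].

    Raises KeyError if a placeholder has no matching entry, so a prompt
    is never sent with an unfilled slot.
    """
    def substitute(match):
        key = match.group(1)
        if key not in values:
            raise KeyError(f"no value supplied for placeholder: {key}")
        return values[key]

    return PLACEHOLDER.sub(substitute, template)


if __name__ == "__main__":
    template = (
        "I want to learn about [insert topic]. Identify and share the most "
        "important 20% of learnings from this topic that will help me "
        "understand 80% of it."
    )
    print(fill_prompt(template, {"topic": "color management"}))
```

Failing loudly on a missing value is a deliberate choice here: a prompt that still contains a literal `[insert topic]` tends to produce a generic answer.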
More suggestions under the post…
-
Black Forest Labs released FLUX.1 Kontext
https://replicate.com/blog/flux-kontext
https://replicate.com/black-forest-labs/flux-kontext-pro
There are three models. Two are available now, and a third, open-weight version is coming soon:
- FLUX.1 Kontext [pro]: State-of-the-art performance for image editing. High-quality outputs, great prompt following, and consistent results.
- FLUX.1 Kontext [max]: A premium model that brings maximum performance, improved prompt adherence, and high-quality typography generation without compromise on speed.
- Coming soon: FLUX.1 Kontext [dev]: An open-weight, guidance-distilled version of Kontext.
We’re so excited with what Kontext can do, we’ve created a collection of models on Replicate to give you ideas:
- Multi-image kontext: Combine two images into one.
- Portrait series: Generate a series of portraits from a single image
- Change haircut: Change a person’s hair style and color
- Iconic locations: Put yourself in front of famous landmarks
- Professional headshot: Generate a professional headshot from any image