BREAKING NEWS
LATEST POSTS
-
OpenCV Python for Computer Vision
-
DepthCrafter – Generating Consistent Normals Long Depth Sequences for Open-world Videos
https://depthcrafter.github.io/
We innovate DepthCrafter, a novel video depth estimation approach, by leveraging video diffusion models. It can generate temporally consistent long depth sequences with fine-grained details for open-world videos, without requiring additional information such as camera poses or optical flow.
-
ByteDance Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency
https://loopyavatar.github.io/
Loopy supports various visual and audio styles. It can generate vivid motion details from audio alone, such as non-speech movements like sighing, emotion-driven eyebrow and eye movements, and natural head movements.
-
Ross Pettit on The Agile Manager – How tech firms went for prioritizing cash flow instead of talent (and artists)
For years, tech firms were fighting a war for talent. Now they are waging war on talent.
This shift has led to a weakening of the social contract between employees and employers, with culture and employee values being sidelined in favor of financial discipline and free cash flow.
The operating environment has changed from a high tolerance for failure (where cheap capital and willing spenders accepted slipped dates and feature lag) to a very low – if not zero – tolerance for failure (fiscal discipline is in vogue again).
While preventing and containing mistakes staves off shocks to the income statement, it doesn’t fundamentally reduce costs. Years of payroll bloat – aggressive hiring, aggressive comp packages to attract and retain people – make labor the biggest cost in tech.
…Of course, companies can reduce their labor force through natural attrition. Other labor policy changes – return to office mandates, contraction of fringe benefits, reduction of job promotions, suspension of bonuses and comp freezes – encourage more people to exit voluntarily. It’s cheaper to let somebody self-select out than it is to lay them off.
…Employees recruited in more recent years from outside the ranks of tech were given the expectation that we’ll teach you what you need to know, we want you to join because we value what you bring to the table. That is no longer applicable. Runway for individual growth is very short in zero-tolerance-for-failure operating conditions. Job preservation, at least in the short term for this cohort, comes from completing corporate training and acquiring professional certifications. Training through community or experience is not in the cards.
…The ability to perform competently in multiple roles, the extra-curriculars, the self-directed enrichment, the ex-company leadership – all these things make no matter. The calculus is what you got paid versus how you performed on objective criteria relative to your cohort. Nothing more.
…Here is where the change in the social contract is perhaps the most blatant. In the “destination employer” years, the employee invested in the community and its values, and the employer rewarded the loyalty of its employees through things like runway for growth (stretch roles and sponsored work innovation) and tolerance for error (valuing demonstrable learning over perfection in execution). No longer.
…http://www.rosspettit.com/2024/08/for-years-tech-was-fighting-war-for.html
-
2DGS – 2D Gaussian Splatting for Geometrically Accurate Radiance Fields
A 2D Gaussian Splats technique for extracting cleaner 3D geometry from 3DGS
https://github.com/hbb1/2d-gaussian-splatting
https://surfsplatting.github.io/
https://colab.research.google.com/drive/1qoclD7HJ3-o0O1R8cvV3PxLhoDCMsH8W
3D Gaussian Splatting (3DGS) has recently revolutionized radiance field reconstruction, achieving high quality novel view synthesis and fast rendering speed without baking. However, 3DGS fails to accurately represent surfaces due to the multi-view inconsistent nature of 3D Gaussians. We present 2D Gaussian Splatting (2DGS), a novel approach to model and reconstruct geometrically accurate radiance fields from multi-view images. Our key idea is to collapse the 3D volume into a set of 2D oriented planar Gaussian disks. Unlike 3D Gaussians, 2D Gaussians provide view-consistent geometry while modeling surfaces intrinsically. To accurately recover thin surfaces and achieve stable optimization, we introduce a perspective-accurate 2D splatting process utilizing ray-splat intersection and rasterization. Additionally, we incorporate depth distortion and normal consistency terms to further enhance the quality of the reconstructions. We demonstrate that our differentiable renderer allows for noise-free and detailed geometry reconstruction while maintaining competitive appearance quality, fast training speed, and real-time rendering.
-
Kiosk – Library Tool for 3D Artists
Kiosk streamlines resource management. With tailored filtering, customizable organization, and seamless integration into Maya, Houdini, Blender and Cinema4D. Maintain one library for them all!
https://fabianstrube.gumroad.com/l/kiosk-library
FEATURED POSTS
-
Aider.chat – A free, open-source AI pair-programming CLI tool
Aider enables developers to interactively generate, modify, and test code by leveraging both cloud-hosted and local LLMs directly from the terminal or within an IDE. Key capabilities include comprehensive codebase mapping, support for over 100 programming languages, automated git commit messages, voice-to-code interactions, and built-in linting and testing workflows. Installation is straightforward via pip or uv, and while the tool itself has no licensing cost, actual usage costs stem from the underlying LLM APIs, which are billed separately by providers like OpenAI or Anthropic.
Key Features
- Cloud & Local LLM Support
Connect to most major LLM providers out of the box, or run models locally for privacy and cost control aider.chat. - Codebase Mapping
Automatically indexes all project files so that even large repositories can be edited contextually aider.chat. - 100+ Language Support
Works with Python, JavaScript, Rust, Ruby, Go, C++, PHP, HTML, CSS, and dozens more aider.chat. - Git Integration
Generates sensible commit messages and automates diffs/undo operations through familiar git tooling aider.chat. - Voice-to-Code
Speak commands to Aider to request features, tests, or fixes without typing aider.chat. - Images & Web Pages
Attach screenshots, diagrams, or documentation URLs to provide visual context for edits aider.chat. - Linting & Testing
Runs lint and test suites automatically after each change, and can fix issues it detects
- Cloud & Local LLM Support
-
Gamma correction
http://www.normankoren.com/makingfineprints1A.html#Gammabox
https://en.wikipedia.org/wiki/Gamma_correction
http://www.photoscientia.co.uk/Gamma.htm
https://www.w3.org/Graphics/Color/sRGB.html
http://www.eizoglobal.com/library/basics/lcd_display_gamma/index.html
https://forum.reallusion.com/PrintTopic308094.aspx
Basically, gamma is the relationship between the brightness of a pixel as it appears on the screen, and the numerical value of that pixel. Generally Gamma is just about defining relationships.
Three main types:
– Image Gamma encoded in images
– Display Gammas encoded in hardware and/or viewing time
– System or Viewing Gamma which is the net effect of all gammas when you look back at a final image. In theory this should flatten back to 1.0 gamma.
-
AI Data Laundering: How Academic and Nonprofit Researchers Shield Tech Companies from Accountability
“Simon Willison created a Datasette browser to explore WebVid-10M, one of the two datasets used to train the video generation model, and quickly learned that all 10.7 million video clips were scraped from Shutterstock, watermarks and all.”
“In addition to the Shutterstock clips, Meta also used 10 million video clips from this 100M video dataset from Microsoft Research Asia. It’s not mentioned on their GitHub, but if you dig into the paper, you learn that every clip came from over 3 million YouTube videos.”
“It’s become standard practice for technology companies working with AI to commercially use datasets and models collected and trained by non-commercial research entities like universities or non-profits.”
“Like with the artists, photographers, and other creators found in the 2.3 billion images that trained Stable Diffusion, I can’t help but wonder how the creators of those 3 million YouTube videos feel about Meta using their work to train their new model.”