FlashVSR is a streaming, one-step, diffusion-based video super-resolution framework that uses block-sparse attention and a Tiny Conditional Decoder. It runs at about 17 FPS on 768×1408 video on a single A100 GPU. A Locality-Constrained Attention design further improves generalization and perceptual quality on ultra-high-resolution videos.
Project page: https://zhuang2002.github.io/FlashVSR/index.html
Model weights (Hugging Face): https://huggingface.co/JunhaoZhuang/FlashVSR
Comparisons: https://zhuang2002.github.io/FlashVSR/comparisons.html
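
To make the idea of locality-constrained, block-sparse attention mentioned above more concrete, here is a minimal NumPy sketch. It is a generic illustration and not FlashVSR's implementation: the function names (local_block_mask, block_sparse_attention), the 1-D token layout, and the block_size and window parameters are all assumptions made for this example. Each query block is only allowed to attend to key blocks within a fixed distance of itself.

```python
import numpy as np


def local_block_mask(n_blocks: int, window: int) -> np.ndarray:
    """Boolean (n_blocks, n_blocks) mask.

    Query block i may attend to key block j only if |i - j| <= window.
    (Hypothetical helper for illustration, not part of FlashVSR.)
    """
    idx = np.arange(n_blocks)
    return np.abs(idx[:, None] - idx[None, :]) <= window


def block_sparse_attention(q, k, v, block_size: int, window: int) -> np.ndarray:
    """Single-head attention restricted to nearby blocks.

    q, k, v: arrays of shape (seq_len, dim); seq_len must be a multiple
    of block_size. This reference version still computes the dense score
    matrix and then masks it. An optimized kernel would skip the masked
    blocks entirely, which is where the speedup comes from.
    """
    seq_len, dim = q.shape
    if seq_len % block_size != 0:
        raise ValueError("seq_len must be a multiple of block_size")

    n_blocks = seq_len // block_size
    block_mask = local_block_mask(n_blocks, window)

    # Expand the block-level mask to token level.
    token_mask = block_mask.repeat(block_size, axis=0).repeat(block_size, axis=1)

    scores = q @ k.T / np.sqrt(dim)
    scores = np.where(token_mask, scores, -np.inf)

    # Numerically stable softmax; every row keeps its own (diagonal) block,
    # so the row maximum is always finite.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ v


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    q, k, v = (rng.standard_normal((8, 4)) for _ in range(3))
    out = block_sparse_attention(q, k, v, block_size=2, window=1)
    print(out.shape)
```

With window set to n_blocks - 1 or larger, the mask is all True and the function reduces to ordinary dense attention. Smaller windows trade global context for less computation per query block.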





