Advancements in Diffusion Models and Video Processing
The field of diffusion models and video processing is evolving rapidly, with a focus on improving efficiency, quality, and scalability. Recent work includes hybrid adaptive diffusion serving systems such as HADIS, which optimize cascade model selection, query routing, and resource allocation, improving response quality by up to 35% while reducing latency violation rates. Bidirectional sparse attention frameworks such as BSA accelerate video diffusion training by dynamically sparsifying queries and key-value pairs. Other notable advances include quantization-aware scheduling, exemplified by Q-Sched, which delivers substantial gains in image generation quality with a 4x reduction in model size, and generative video compositing models such as GenCompositor, which enable interactive video editing.
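To make the sparsification idea concrete: the core trick in sparse attention is that each query attends only to a small subset of key-value pairs instead of all of them. The sketch below is a minimal, illustrative top-k variant in NumPy; it is not BSA's actual algorithm (which operates bidirectionally inside video diffusion training), and the function name and `keep` parameter are assumptions made for illustration.

```python
import numpy as np

def topk_sparse_attention(q, k, v, keep=4):
    """Illustrative sketch: each query attends only to its top-`keep`
    keys by similarity score. Not the actual BSA method."""
    scores = q @ k.T / np.sqrt(q.shape[-1])      # (n_q, n_k) scaled dot products
    # Indices of all keys EXCEPT the top-`keep` per query (argsort is ascending)
    drop_idx = np.argsort(scores, axis=-1)[:, :-keep]
    mask = np.zeros_like(scores, dtype=bool)
    np.put_along_axis(mask, drop_idx, True, axis=-1)
    scores[mask] = -np.inf                       # masked keys get zero weight
    # Numerically stable softmax over the surviving keys only
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v                           # (n_q, d) attention output

rng = np.random.default_rng(0)
q = rng.normal(size=(8, 16))
k = rng.normal(size=(32, 16))
v = rng.normal(size=(32, 16))
out = topk_sparse_attention(q, k, v, keep=4)
print(out.shape)  # (8, 16)
```

With `keep=4` each query row mixes only 4 of the 32 value vectors, so the softmax and weighted sum touch far fewer entries; efficient implementations exploit this sparsity to skip the masked computation entirely rather than masking after the fact, which is where the training speedup comes from.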
Sources
DrDiff: Dynamic Routing Diffusion with Hierarchical Attention for Breaking the Efficiency-Quality Trade-off
MICACL: Multi-Instance Category-Aware Contrastive Learning for Long-Tailed Dynamic Facial Expression Recognition