Diffusion Models for Human Pose and Virtual Try-On

Computer vision research is advancing rapidly in human pose estimation and virtual try-on. Researchers are increasingly leveraging diffusion models to handle the complexity of articulated human poses and the scarcity of high-quality training data. These models serve as learned priors that unify multiple pose-centric tasks, which can then be solved through variational diffusion sampling. Diffusion-based frameworks are also being applied to virtual try-on, enabling more realistic, end-to-end garment synthesis, with recent systems supporting arbitrary poses and mask-free operation. These developments are expected to have a significant impact on applications such as e-commerce and entertainment. Noteworthy papers include DPoser-X, which presents a diffusion-based prior model for 3D whole-body human poses; Voost, which proposes a unified and scalable diffusion transformer for bidirectional virtual try-on and try-off; and One Model For All, which introduces a unified diffusion framework for try-on and try-off that operates without exhibition garments and supports arbitrary poses.
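To make the pose-prior idea concrete, the sketch below shows one common way a pretrained diffusion model can regularize a downstream pose task: the denoising-score-matching loss is added to a task loss (e.g., 2D reprojection error) during test-time optimization, so poses the diffusion model can denoise well are favored. This is a minimal illustration in the spirit of methods like DPoser-X, not their actual implementation; `score_net`, the annealing schedule, and all other names here are hypothetical assumptions.

```python
# Hypothetical sketch: a diffusion model as a pose prior for
# test-time optimization. `score_net(x_t, t)` is assumed to be a
# pretrained network that predicts the noise added to a pose vector.
import torch

def q_sample(x0, t, alphas_cumprod):
    """Forward-diffuse a clean pose x0 to noise level t (standard DDPM)."""
    a_bar = alphas_cumprod[t].view(-1, 1)           # (B, 1)
    noise = torch.randn_like(x0)
    return a_bar.sqrt() * x0 + (1 - a_bar).sqrt() * noise, noise

def prior_loss(score_net, pose, t, alphas_cumprod):
    """Denoising-score-matching loss used as a plausibility regularizer."""
    x_t, noise = q_sample(pose, t, alphas_cumprod)
    pred_noise = score_net(x_t, t)                  # predicts added noise
    return ((pred_noise - noise) ** 2).mean()

def optimize_pose(score_net, pose_init, task_loss_fn, alphas_cumprod,
                  steps=200, lam=0.1, lr=1e-2):
    """Refine a pose estimate by minimizing task loss + prior term."""
    pose = pose_init.clone().requires_grad_(True)
    opt = torch.optim.Adam([pose], lr=lr)
    T = len(alphas_cumprod)
    for i in range(steps):
        # Anneal the noise level from high to low over the optimization.
        t = torch.full((pose.shape[0],),
                       int((1 - i / steps) * (T - 1)), dtype=torch.long)
        loss = (task_loss_fn(pose)
                + lam * prior_loss(score_net, pose, t, alphas_cumprod))
        opt.zero_grad()
        loss.backward()
        opt.step()
    return pose.detach()
```

The design point this illustrates is why a single diffusion prior can serve many pose-centric tasks: only `task_loss_fn` changes per task (completion, fitting, denoising), while the prior term stays fixed.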

Sources

DPoser-X: Diffusion Model as Robust 3D Whole-body Human Pose Prior

DreamVVT: Mastering Realistic Video Virtual Try-On in the Wild via a Stage-Wise Diffusion Transformer Framework

OmniShape: Zero-Shot Multi-Hypothesis Shape and Pose Estimation in the Real World

MonoCloth: Reconstruction and Animation of Cloth-Decoupled Human Avatars from Monocular Videos

Two-Way Garment Transfer: Unified Diffusion Framework for Dressing and Undressing Synthesis

One Model For All: Partial Diffusion for Unified Try-On and Try-Off in Any Pose

Voost: A Unified and Scalable Diffusion Transformer for Bidirectional Virtual Try-On and Try-Off
