Advances in 3D Human Pose and Shape Estimation

The field of 3D human pose and shape estimation is advancing rapidly, with a focus on improving robustness in challenging scenarios such as occlusions and complex human poses. Recent work has produced new benchmark datasets that incorporate realistic occlusions, which are essential for training and evaluating methods under conditions closer to real deployments. Other efforts improve the generalization of lifting-based 3D human pose estimation, in which 2D keypoints detected in an image are lifted to 3D joint positions, so that models transfer better to unseen datasets. Cross-domain learning frameworks have been developed for long-horizon human-scene interaction tasks, reporting gains in task success rate and execution efficiency. Researchers have also explored RGBD cameras for 3D human mesh estimation, using the additional depth channel to resolve the scale and depth ambiguities that are difficult to recover from RGB alone.

Noteworthy papers in this area include VOccl3D, which introduces a video benchmark dataset for 3D human pose and shape estimation under real occlusions; AugLift, which proposes a simple yet effective reformulation of the standard lifting pipeline to improve generalization; DETACH, a cross-domain learning framework for long-horizon tasks built on a mixture of disentangled experts; M^3, a masked-autoencoder approach to 3D human mesh estimation from single-view RGBD images; Waymo-3DSkelMo, a large-scale multi-agent 3D skeletal motion dataset for pedestrian interaction modeling in autonomous driving; and Human-in-Context, a unified cross-domain approach to 3D human motion modeling via in-context learning.
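
To make the lifting formulation concrete (the standard pipeline that AugLift reformulates), the sketch below shows a minimal, generic 2D-to-3D lifter in PyTorch: detected 2D keypoints are flattened and regressed to 3D joint positions by a small residual MLP, in the spirit of common simple-baseline lifters. The 17-joint skeleton, layer sizes, and names here are illustrative assumptions, not the AugLift architecture.

```python
import torch
import torch.nn as nn

NUM_JOINTS = 17  # assumed COCO-style skeleton; purely illustrative


class LiftingNetwork(nn.Module):
    """Generic lifter: 2D keypoints (x, y per joint) -> 3D joint positions."""

    def __init__(self, num_joints: int = NUM_JOINTS, hidden: int = 1024):
        super().__init__()
        self.num_joints = num_joints
        self.inp = nn.Linear(num_joints * 2, hidden)
        # One residual block of two fully connected layers.
        self.block = nn.Sequential(
            nn.Linear(hidden, hidden), nn.BatchNorm1d(hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.BatchNorm1d(hidden), nn.ReLU(),
        )
        self.out = nn.Linear(hidden, num_joints * 3)

    def forward(self, kp2d: torch.Tensor) -> torch.Tensor:
        # kp2d: (batch, num_joints, 2) normalized image coordinates.
        x = torch.relu(self.inp(kp2d.flatten(1)))
        x = x + self.block(x)  # residual refinement
        return self.out(x).view(-1, self.num_joints, 3)


if __name__ == "__main__":
    lifter = LiftingNetwork()
    fake_kp2d = torch.rand(4, NUM_JOINTS, 2)  # a batch of 4 detections
    pred_3d = lifter(fake_kp2d)
    print(pred_3d.shape)  # torch.Size([4, 17, 3])
```

Because such lifters see only sparse 2D coordinates at train time, their accuracy hinges on how well the 2D keypoint distribution matches the deployment domain, which is the generalization gap that lifting-focused work in this area targets.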

Sources

VOccl3D: A Video Benchmark Dataset for 3D Human Pose and Shape Estimation under real Occlusions

AugLift: Boosting Generalization in Lifting-based 3D Human Pose Estimation

DETACH: Cross-domain Learning for Long-Horizon Tasks via Mixture of Disentangled Experts

3D Human Mesh Estimation from Single View RGBD

Waymo-3DSkelMo: A Multi-Agent 3D Skeletal Motion Dataset for Pedestrian Interaction Modeling in Autonomous Driving

Human-in-Context: Unified Cross-Domain 3D Human Motion Modeling via In-Context Learning
