Advances in Object Pose Estimation and Robotic Control

The fields of computer vision and robotics are moving toward more generalizable and scalable solutions for object pose estimation and robotic control. Researchers are shifting their focus from instance-level methods, which target specific known objects, to category-level and open-set methods, which can handle unknown textures, shapes, and sizes. This shift calls for more robust algorithms and for datasets that bridge the gap between category-level and open-set object pose estimation. Notable papers in this area include:

  • PRISM-DP, which enables compact diffusion policy learning directly from spatial poses of task-relevant objects, eliminating the need for manual mesh processing or creation.
  • PRISM, which proposes an integrated real-to-sim-to-real pipeline for scene-aware robotic control with few demonstrations.
  • A zero-shot part assembly method that leverages pretrained diffusion models to guide the manipulation of parts so that they form realistic shapes.

Sources

Category-Level and Open-Set Object Pose Estimation for Robotics

PRISM-DP: Spatial Pose-based Observations for Diffusion-Policies via Segmentation, Mesh Generation, and Pose Tracking

PRISM: Projection-based Reward Integration for Scene-Aware Real-to-Sim-to-Real Transfer with Few Demonstrations

Leveraging Pretrained Diffusion Models for Zero-Shot Part Assembly
