Advances in 3D Perception and Robotic Manipulation

Recent work in 3D perception and robotic manipulation focuses on more robust and accurate methods for pose estimation, depth estimation, and grasping in cluttered environments. Researchers are exploring novel architectures, such as dual-stream networks and probabilistic frameworks, to improve the accuracy and reliability of 3D perception tasks. There is also growing interest in methods that handle uncertainty and ambiguity explicitly, for example by modeling pose distributions or by using active perception strategies. Notable papers include VLM6D, which proposes a dual-stream architecture for 6D pose estimation; SE(3)-PoseFlow, which introduces a probabilistic framework for estimating 6D pose distributions; and GraspView, which presents an RGB-only robotic grasping pipeline that achieves accurate manipulation in cluttered environments without depth sensors.

Sources

VLM6D: VLM based 6Dof Pose Estimation based on RGB-D Images

Discriminately Treating Motion Components Evolves Joint Depth and Ego-Motion Learning

SE(3)-PoseFlow: Estimating 6D Pose Distributions for Uncertainty-Aware Robotic Manipulation

Are Euler angles a useful rotation parameterisation for pose estimation with Normalizing Flows?

Room Envelopes: A Synthetic Dataset for Indoor Layout Reconstruction from Images

GraspView: Active Perception Scoring and Best-View Optimization for Robotic Grasping in Cluttered Environments

BoRe-Depth: Self-supervised Monocular Depth Estimation with Boundary Refinement for Embedded Systems
