Advances in 3D Perception and Robotics

Research in 3D perception and robotics is advancing rapidly, with a focus on more accurate and efficient methods for object pose estimation, 3D reconstruction, and visual localization. Recent work uses diffusion models, Gaussian Splatting, and related techniques to improve the accuracy and robustness of these methods. Uncertainty estimates and probabilistic approaches are increasingly common, since they support better-informed decision-making and improved performance in real-world applications. There is also growing interest in generalizable, scalable methods that can handle complex and dynamic environments. Overall, the field is moving toward more sophisticated and autonomous systems, with potential applications in robotics, computer vision, and agriculture.

Noteworthy papers include UnPose, which proposes a framework for zero-shot 6D object pose estimation and reconstruction, and PVNet, which presents a diffusion-based point-voxel interaction framework for LiDAR point cloud upsampling. Other notable works include GSVisLoc, a visual localization method designed for 3D Gaussian Splatting scene representations, and PAUL, a framework for robust cross-view geo-localization under noisy correspondence.

Sources

UnPose: Uncertainty-Guided Diffusion Priors for Zero-Shot Pose Estimation

A Unified Voxel Diffusion Module for Point Cloud 3D Object Detection

GPL-SLAM: A Laser SLAM Framework with Gaussian Process Based Extended Landmarks

PVNet: Point-Voxel Interaction LiDAR Scene Upsampling Via Diffusion Models

Fiducial Marker Splatting for High-Fidelity Robotics Simulations

4D Visual Pre-training for Robot Learning

GWM: Towards Scalable Gaussian World Models for Robotic Manipulation

Camera Pose Refinement via 3D Gaussian Splatting

Variational Shape Inference for Grasp Diffusion on SE(3)

GSVisLoc: Generalizable Visual Localization for Gaussian Splatting Scene Representations

SignLoc: Robust Localization using Navigation Signs and Public Maps

AgriChrono: A Multi-modal Dataset Capturing Crop Growth and Lighting Variability with a Field Robot

Can we make NeRF-based visual localization privacy-preserving?

DATR: Diffusion-based 3D Apple Tree Reconstruction Framework with Sparse-View

Beyond BEV: Optimizing Point-Level Tokens for Collaborative Perception

Integrating SAM Supervision for 3D Weakly Supervised Point Cloud Segmentation

PAUL: Uncertainty-Guided Partition and Augmentation for Robust Cross-View Geo-Localization under Noisy Correspondence

Adam SLAM - the last mile of camera calibration with 3DGS

Enhancing Pseudo-Boxes via Data-Level LiDAR-Camera Fusion for Unsupervised 3D Object Detection

ActLoc: Learning to Localize on the Move via Active Viewpoint Selection
