Advancements in 3D Vision and Segmentation

The field of 3D vision and segmentation is evolving rapidly, driven by the search for more accurate and efficient methods of understanding and interpreting 3D data. A notable shift is toward open-world part segmentation, where models segment 3D objects into parts without being restricted to a fixed taxonomy, yielding more generalizable and scalable models that can handle complex and diverse 3D data.

Notable papers in this area include PartSAM, which introduces a promptable part segmentation model trained natively on large-scale 3D data, and MUSplat, which presents a training-free framework for lifting 2D open-vocabulary understanding into 3D Gaussian Splatting scenes. Diff-3DCap represents a 3D object as a sequence of projected views and uses a continuous diffusion model for captioning, while PinPoint3D introduces an interactive framework for fine-grained, multi-granularity 3D segmentation from a few clicks. Additionally, GeoPurify proposes a data-efficient geometric distillation framework for open-vocabulary 3D segmentation, achieving superior data efficiency and state-of-the-art performance on major 3D benchmarks.
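To make the idea of prompt-driven (e.g. click-based) 3D segmentation concrete, here is a toy sketch in pure Python: a user "clicks" one point of a point cloud, and a naive region-growing pass expands the selection to nearby points. This is only an illustration of the interaction pattern, not the actual method used by PartSAM or PinPoint3D, and all names here (`segment_from_click`, `radius`) are made up for the example.

```python
from collections import deque

def segment_from_click(points, click_idx, radius=0.2):
    """Toy click-prompted segmentation: BFS region growing from the
    clicked point, adding any point within `radius` of a point already
    in the region. Illustrative only; real models learn this mapping."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    r2 = radius ** 2
    seg = {click_idx}            # indices belonging to the segment
    frontier = deque([click_idx])
    while frontier:
        i = frontier.popleft()
        for j, p in enumerate(points):
            if j not in seg and dist2(points[i], p) <= r2:
                seg.add(j)
                frontier.append(j)
    return sorted(seg)

# Two well-separated clusters; clicking in one selects only that cluster.
cloud = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.2, 0.1, 0.0),
         (5.0, 5.0, 5.0), (5.1, 5.0, 5.0)]
print(segment_from_click(cloud, 0))  # prints [0, 1, 2]
```

Learned interactive models replace the fixed distance threshold with features predicted from the prompt, which is what lets them separate parts that are geometrically contiguous.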

Sources

PartSAM: A Scalable Promptable Part Segmentation Model Trained on Native 3D Data

Polysemous Language Gaussian Splatting via Matching-based Mask Lifting

Diff-3DCap: Shape Captioning with Diffusion Models

ASIA: Adaptive 3D Segmentation using Few Image Annotations

PinPoint3D: Fine-Grained 3D Part Segmentation from a Few Clicks

PhraseStereo: The First Open-Vocabulary Stereo Image Segmentation Dataset

GeoPurify: A Data-Efficient Geometric Distillation Framework for Open-Vocabulary 3D Segmentation
