Geometry-Aware Semantic Scene Understanding

The field of 3D scene understanding is moving toward geometry-aware semantic features that improve the accuracy and robustness of tasks such as object localization, pose estimation, and scene graph prediction. Recent work combines visual, semantic, and geometric features within unified frameworks to reach state-of-the-art performance. In particular, geometry grounding and uncertainty-aware neural feature fields show promise for improving the reliability and generalizability of 3D scene understanding models.

Noteworthy papers include:

- Geometry Meets Vision: Revisiting Pretrained Semantics in Distilled Fields, which investigates the potential benefits of geometry grounding in distilled feature fields and proposes a novel framework for inverting radiance fields.
- Object-Centric Representation Learning for Enhanced 3D Scene Graph Prediction, which shows that object feature quality largely determines scene graph accuracy and proposes a highly discriminative object feature encoder.
- Progressive Gaussian Transformer with Anisotropy-aware Sampling for Open Vocabulary Occupancy Prediction, which achieves state-of-the-art open-vocabulary 3D occupancy prediction using a progressive Gaussian transformer framework.

Sources

Geometry Meets Vision: Revisiting Pretrained Semantics in Distilled Fields

Joint Neural SDF Reconstruction and Semantic Segmentation for CAD Models

Object-Centric Representation Learning for Enhanced 3D Scene Graph Prediction

Progressive Gaussian Transformer with Anisotropy-aware Sampling for Open Vocabulary Occupancy Prediction

UniFField: A Generalizable Unified Neural Feature Field for Visual, Semantic, and Spatial Uncertainties in Any Scene

Introspection in Learned Semantic Scene Graph Localisation
