Advancements in 3D Scene Understanding and Reconstruction

The field of 3D scene understanding and reconstruction is advancing rapidly, with a focus on more efficient and accurate methods for processing and analyzing 3D data. Recent work explores frameworks and techniques such as Gaussian splatting, point cloud processing, and semantic scene graph generation to improve both the quality and the speed of 3D scene reconstruction. There is also growing interest in open-vocabulary 3D instance segmentation, which recognizes and segments objects in 3D scenes without requiring a fixed, predefined set of object categories. Noteworthy papers include ScenePainter, which proposes a framework for semantically consistent perpetual 3D scene generation; VisHall3D, which introduces a two-stage framework for monocular semantic scene completion; FROSS, which presents an online, faster-than-real-time approach to 3D semantic scene graph generation; and NeuroVoxel-LM, which proposes a framework for language-aligned 3D perception via dynamic voxelization and meta-embedding.
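
To make the open-vocabulary idea concrete, the sketch below illustrates the matching step shared by many such pipelines: each 3D instance mask is summarized by a vision-language feature vector (e.g., pooled CLIP-style image features lifted into 3D), and labels are assigned by cosine similarity against text embeddings of arbitrary category names. This is a minimal, hypothetical illustration with random stand-in embeddings, not code from any of the cited papers.

```python
import numpy as np

def assign_open_vocab_labels(instance_feats, text_feats, label_names):
    """Assign each 3D instance the best-matching label from an arbitrary
    vocabulary via cosine similarity between instance and text embeddings."""
    # L2-normalize so the dot product equals cosine similarity.
    inst = instance_feats / np.linalg.norm(instance_feats, axis=1, keepdims=True)
    text = text_feats / np.linalg.norm(text_feats, axis=1, keepdims=True)
    sims = inst @ text.T                       # shape: (num_instances, num_labels)
    best = sims.argmax(axis=1)
    return [(label_names[j], float(sims[i, j])) for i, j in enumerate(best)]

# Toy usage: random vectors stand in for real per-instance and per-label embeddings.
rng = np.random.default_rng(0)
labels = ["office chair", "potted plant", "coffee mug"]
instance_feats = rng.normal(size=(4, 512))     # one vector per 3D instance mask
text_feats = rng.normal(size=(len(labels), 512))
print(assign_open_vocab_labels(instance_feats, text_feats, labels))
```

Because the vocabulary is supplied only as text, new categories can be queried at inference time without retraining the 3D segmentation backbone.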

Sources

Part Segmentation of Human Meshes via Multi-View Human Parsing

Equivariant Volumetric Grasping

ScenePainter: Semantically Consistent Perpetual 3D Scene Generation with Concept Relation Alignment

3DGauCIM: Accelerating Static/Dynamic 3D Gaussian Splatting via Digital CIM for High Frame Rate Real-Time Edge Rendering

VisHall3D: Monocular Semantic Scene Completion from Reconstructing the Visible Regions to Hallucinating the Invisible Regions

Querying Autonomous Vehicle Point Clouds: Enhanced by 3D Object Counting with CounterNet

Foundation Model-Driven Grasping of Unknown Objects via Center of Gravity Estimation

BEV-LLM: Leveraging Multimodal BEV Maps for Scene Captioning in Autonomous Driving

GS-Occ3D: Scaling Vision-only Occupancy Reconstruction for Autonomous Driving with Gaussian Splatting

Co-Win: Joint Object Detection and Instance Segmentation in LiDAR Point Clouds via Collaborative Window Processing

Taking Language Embedded 3D Gaussian Splatting into the Wild

FROSS: Faster-than-Real-Time Online 3D Semantic Scene Graph Generation from RGB-D Images

NeuroVoxel-LM: Language-Aligned 3D Perception via Dynamic Voxelization and Meta-Embedding

VESPA: Towards un(Human)supervised Open-World Pointcloud Labeling for Autonomous Driving

GaRe: Relightable 3D Gaussian Splatting for Outdoor Scenes from Unconstrained Photo Collections

Methods for the Segmentation of Reticular Structures Using 3D LiDAR Data: A Comparative Evaluation

SLTarch: Towards Scalable Point-Based Neural Rendering by Taming Workload Imbalance and Memory Irregularity

No Redundancy, No Stall: Lightweight Streaming 3D Gaussian Splatting for Real-time Rendering

Contrast-Prior Enhanced Duality for Mask-Free Shadow Removal

Ov3R: Open-Vocabulary Semantic 3D Reconstruction from RGB Videos

Graph-Guided Dual-Level Augmentation for 3D Scene Segmentation

Viser: Imperative, Web-based 3D Visualization in Python

Details Matter for Indoor Open-vocabulary 3D Instance Segmentation

MagicRoad: Semantic-Aware 3D Road Surface Reconstruction via Obstacle Inpainting

FastPoint: Accelerating 3D Point Cloud Model Inference via Sample Point Distance Prediction

I2V-GS: Infrastructure-to-Vehicle View Transformation with Gaussian Splatting for Autonomous Driving Data Generation

RAGNet: Large-scale Reasoning-based Affordance Segmentation Benchmark towards General Grasping

SeqAffordSplat: Scene-level Sequential Affordance Reasoning on 3D Gaussian Splatting
