Research in 3D scene understanding is advancing rapidly, with current work focused on more accurate and efficient methods for scene reconstruction, segmentation, and completion. Recent studies apply deep learning architectures, such as PointNet and Transformer-based models, to improve the accuracy and robustness of 3D scene understanding. There is also growing interest in methods that can handle complex, dynamic scenes, such as natural environments and urban areas. Noteworthy papers in this area include BuildingBRep-11K, which introduces a large dataset of multi-storey buildings for training and evaluating 3D scene understanding models; IPFormer, which proposes a context-adaptive instance proposal approach for vision-based 3D Panoptic Scene Completion; and PanSt3R, which presents a unified approach to multi-view consistent panoptic segmentation and achieves state-of-the-art performance on several benchmarks.