The field of 3D scene reconstruction and panoramic video generation is shifting toward generative, diffusion-based methods. Recent work demonstrates strong capabilities in generating virtual environments, simulating real-world scenes, and creating dynamic 3D scenes from monocular input video. Techniques such as self-distillation frameworks, spherical shortest-path-based superpixels, and epipolar-aware diffusion models have yielded measurable gains in segmentation accuracy, shape regularity, and geometric consistency. Noteworthy papers include:
- Lyra, which proposes a self-distillation framework that transfers knowledge from a video diffusion model to a generative 3D scene reconstruction model, achieving state-of-the-art performance in static and dynamic 3D scene generation; a schematic sketch of this self-distillation pattern follows the list.
- CamPVG, which introduces a diffusion-based framework for panoramic video generation guided by precise camera poses, generating high-quality panoramic videos consistent with camera trajectories.
- PhiGenesis, which presents a unified framework for 4D scene generation that extends video generation techniques with geometric and temporal consistency, achieving state-of-the-art performance in appearance and geometric reconstruction, temporal generation, and novel view synthesis.
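As a rough illustration of the self-distillation pattern referenced above, the sketch below pairs a frozen "teacher" video diffusion model, which synthesizes pseudo ground-truth views along a camera trajectory, with a lightweight "student" reconstruction model trained against those outputs. All class names, interfaces, and tensor shapes here are hypothetical placeholders for illustration, not Lyra's actual architecture or API.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TeacherVideoDiffusion(nn.Module):
    """Hypothetical stand-in for a frozen, pretrained camera-conditioned
    video diffusion model; here it simply emits random frames."""

    def forward(self, cond_frame: torch.Tensor, poses: torch.Tensor) -> torch.Tensor:
        b, t = poses.shape[:2]
        # Pretend to synthesize a T-frame video along the camera trajectory.
        return torch.rand(b, t, 3, 64, 64)


class StudentReconstructor(nn.Module):
    """Hypothetical feed-forward student that maps generated frames to a
    renderable representation (a toy stand-in for e.g. 3D Gaussians)."""

    def __init__(self) -> None:
        super().__init__()
        self.encoder = nn.Conv2d(3, 16, 3, padding=1)
        self.decoder = nn.Conv2d(16, 3, 3, padding=1)

    def render(self, frames: torch.Tensor, poses: torch.Tensor) -> torch.Tensor:
        # Poses are unused in this toy stand-in; a real model would render
        # its 3D representation from each camera pose.
        b, t, c, h, w = frames.shape
        feats = F.relu(self.encoder(frames.flatten(0, 1)))
        return self.decoder(feats).view(b, t, c, h, w)


def self_distillation_step(teacher, student, optimizer, cond_frame, poses):
    # Teacher generates pseudo ground-truth novel views; no real 3D
    # supervision is required.
    with torch.no_grad():
        pseudo_views = teacher(cond_frame, poses)
    # Student re-renders the scene and is supervised by the teacher's output.
    rendered = student.render(pseudo_views, poses)
    loss = F.mse_loss(rendered, pseudo_views)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


if __name__ == "__main__":
    teacher, student = TeacherVideoDiffusion(), StudentReconstructor()
    optimizer = torch.optim.Adam(student.parameters(), lr=1e-4)
    cond_frame = torch.rand(1, 3, 64, 64)  # single conditioning image
    poses = torch.rand(1, 8, 4, 4)         # 8 camera poses (4x4 extrinsics)
    print(self_distillation_step(teacher, student, optimizer, cond_frame, poses))
```

The key design point this sketch captures is that the diffusion model acts purely as a data generator: the student only ever sees teacher-synthesized views, which is what allows such frameworks to train 3D reconstruction without ground-truth multi-view captures.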