Advances in 4D Content Generation and Video Editing

Recent work in computer vision has concentrated on generating high-quality 4D content and on editing images and videos. One key direction is the integration of reconstructed scenes with 4D human animation to produce seamless, realistic composites. A second is instruction-based image and video editing, which aims to make editing efficient and interactive. A third is the generation of 3D stereoscopic and spatial video for immersive applications, as in S^2VG. Noteworthy papers include AnimateScene, which addresses the challenges of integrating reconstructed scenes with 4D human animation, and DreamVE, which introduces a unified model for instruction-based image and video editing. Other notable papers are Restage4D, which reanimates a deformable 3D reconstruction from a single video; Splat4D, which applies diffusion-enhanced 4D Gaussian splatting to temporally and spatially consistent content creation; and X2Edit, which revisits arbitrary-instruction image editing through self-constructed data and task-aware representation learning.

Sources

AnimateScene: Camera-controllable Animation in Any Scene

DreamVE: Unified Instruction-based Image and Video Editing

Restage4D: Reanimating Deformable 3D Reconstruction from a Single Video

Splat4D: Diffusion-Enhanced 4D Gaussian Splatting for Temporally and Spatially Consistent Content Creation

X2Edit: Revisiting Arbitrary-Instruction Image Editing through Self-Constructed Data and Task-Aware Representation Learning

Dream4D: Lifting Camera-Controlled I2V towards Spatiotemporally Consistent 4D Generation

Generative Video Matting

S^2VG: 3D Stereoscopic and Spatial Video Generation via Denoising Frame Matrix

Omni-Effects: Unified and Spatially-Controllable Visual Effects Generation

Matrix-3D: Omnidirectional Explorable 3D World Generation

MAViS: A Multi-Agent Framework for Long-Sequence Video Storytelling

RealisMotion: Decomposed Human Motion Control and Video Generation in the World Space

Yan: Foundational Interactive Video Generation

Preacher: Paper-to-Video Agentic System

HumanGenesis: Agent-Based Geometric and Generative Modeling for Synthetic Human Dynamics

InterSyn: Interleaved Learning for Dynamic Motion Synthesis in the Wild