Physics-Grounded Video Generation

Video generation is increasingly grounded in physical principles so that generated dynamics align with real-world physics. Two complementary directions drive this: frameworks that iteratively refine prompts using feedback on detected physical inconsistencies, and methods that estimate the static initial physical properties of objects in a single image to condition generation. Together these approaches yield high-quality videos with rich dynamic behaviors and physical realism (a minimal sketch of the refinement loop follows the list below). Noteworthy papers include:

  • Bootstrapping Physics-Grounded Video Generation through VLM-Guided Iterative Self-Refinement, which proposes an iterative self-refinement framework in which a VLM critic detects physical inconsistencies and drives prompt refinement for physics-aware video generation.
  • PhysChoreo: Physics-Controllable Video Generation with Part-Aware Semantic Grounding, which generates videos with diverse controllability and physical realism from a single image by grounding part-aware physical properties.
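
The core loop behind such self-refinement frameworks is simple to state: generate, critique, refine, repeat. Below is a minimal Python sketch under stated assumptions, with three pluggable components (a text-to-video generator, a VLM critic that describes physical inconsistencies, and a prompt refiner that folds those descriptions back into the prompt); all names here are illustrative placeholders, not the papers' actual APIs.

```python
from typing import Callable, List

def self_refine(
    prompt: str,
    generate: Callable[[str], object],        # text prompt -> generated video
    critique: Callable[[object], List[str]],  # video -> list of physics issues
    refine: Callable[[str, List[str]], str],  # prompt + issues -> revised prompt
    max_rounds: int = 3,
) -> object:
    """Regenerate until the critic reports no physical inconsistencies
    or the round budget runs out; return the last video produced.
    (Hypothetical sketch; component names are not from the paper.)"""
    video = generate(prompt)
    for _ in range(max_rounds - 1):
        issues = critique(video)
        if not issues:                        # physically plausible: stop early
            break
        prompt = refine(prompt, issues)       # fold VLM feedback into the prompt
        video = generate(prompt)
    return video
```

In a setting like the paper's, critique would presumably be a vision-language model prompted to flag violations such as interpenetrating objects or gravity-defying motion, and refine a language model that rewrites the prompt to address those flags.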

Sources

Bootstrapping Physics-Grounded Video Generation through VLM-Guided Iterative Self-Refinement

Mistake Attribution: Fine-Grained Mistake Understanding in Egocentric Videos

PhysChoreo: Physics-Controllable Video Generation with Part-Aware Semantic Grounding
