Work on world models and interactive environments is moving toward more precise and generalizable representations of complex dynamics and behaviors. Researchers are exploring ways to learn neuro-symbolic world models from gameplay video and other interactive data, so that learned environment dynamics transfer more efficiently and are easier to explain. Noteworthy papers include:
- Finite Automata Extraction, which proposes learning a neuro-symbolic world model from gameplay video and reports more precise and generalizable results than prior methods.
- Matrix-Game 2.0, which presents an open-source interactive world model that streams high-quality video in real time, generating frames on the fly via few-step auto-regressive diffusion.
- Pixels to Play, which introduces a foundation model that learns to play a wide range of 3D video games with recognizable human-like behavior, demonstrating competent play across simple and classic titles.
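
To make the idea of a symbolic, automaton-style world model more concrete, here is a minimal toy sketch. It is not the algorithm from Finite Automata Extraction or any other paper listed above. It assumes that frames have already been mapped to discrete symbols, and the state names, action names, and the helpers `learn_transitions` and `rollout` are all hypothetical, chosen only for illustration.

```python
from collections import Counter, defaultdict


def learn_transitions(traces):
    """Build a transition table from (state, action, next_state) triples.

    Each (state, action) pair maps to the next state observed most often.
    """
    counts = defaultdict(Counter)
    for state, action, next_state in traces:
        counts[(state, action)][next_state] += 1
    return {key: c.most_common(1)[0][0] for key, c in counts.items()}


def rollout(table, start, actions):
    """Predict the state sequence for a list of actions.

    Assumption: an unseen (state, action) pair leaves the state unchanged.
    """
    states = [start]
    for action in actions:
        states.append(table.get((states[-1], action), states[-1]))
    return states


# Hypothetical symbolic traces, standing in for symbols extracted from video.
traces = [
    ("ground", "jump", "air"),
    ("air", "wait", "ground"),
    ("ground", "right", "ground"),
    ("ground", "jump", "air"),
]

table = learn_transitions(traces)
print(rollout(table, "ground", ["jump", "wait", "right"]))
# → ['ground', 'air', 'ground', 'ground']
```

Because the learned model is an explicit table, it can be inspected and edited directly, which is one sense in which symbolic world models are described as more explainable than purely neural ones.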