The field of large language models is moving toward more sophisticated applications in human motion and animation. Recent work has shown promising results in using LLMs to generate and control 3D avatar animations, with a focus on improving planning performance and handling multi-step movements. There is also growing interest in leveraging LLMs for text-driven map animation prototyping and for embodied spatial-temporal reasoning, where the incorporation of long-term spatial-temporal memory has been identified as a key area for improvement. Overall, the field is advancing toward more nuanced and realistic animations, with potential applications in virtual and augmented reality. Noteworthy papers include:
- A paper introducing MapStory, an LLM-powered animation authoring tool that generates editable map animation sequences directly from natural language text.
- A paper proposing 3DLLM-Mem, a novel dynamic memory management and fusion model that equips LLMs for embodied spatial-temporal reasoning and action, achieving state-of-the-art performance across a range of tasks.
- A paper presenting a data-driven framework for quality assessment of 3D human animation, built on a novel dataset and achieving a 90% correlation with subjective realism evaluation scores.