Advances in Embodied Intelligence and Human-Robot Collaboration

The field of embodied intelligence and human-robot collaboration is advancing rapidly, with a focus on robots that can understand and respond to natural language instructions, adapt to changing environments, and learn from experience. Recent research has explored vision-language models, scene graphs, and proactive replanning to improve robot autonomy and resilience, with demonstrated effectiveness in applications such as object retrieval, navigation, and manipulation. Noteworthy papers include OVSegDT, which introduces a lightweight transformer policy for open-vocabulary object goal navigation and achieves state-of-the-art results on the HM3D-OVON dataset; Embodied-R1, which pioneers pointing as a unified intermediate representation for embodied reasoning and shows robust zero-shot generalization across 11 embodied spatial and pointing benchmarks; and DEXTER-LLM, which integrates large language models with model-based assignment methods for dynamic task planning in unknown environments and performs strongly in experimental evaluations.
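Several of the listed papers build on the idea of a scene graph that is updated as the camera observes the workspace and that a language-conditioned planner can query for object retrieval. As a rough illustration of that idea only (not the implementation used by RoboRetriever or any other paper above), the Python sketch below maintains a minimal dynamic scene graph with a single "near" relation and a string-matching stand-in for vision-language grounding; all class and method names here are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ObjectNode:
    """An object detected by the camera and tracked in the scene graph."""
    node_id: int
    label: str          # open-vocabulary label, e.g. "red mug"
    position: tuple     # (x, y, z) in the robot's frame
    confidence: float


class DynamicSceneGraph:
    """Minimal scene graph: object nodes plus spatial relations,
    refreshed each time new detections arrive (illustrative sketch)."""

    def __init__(self):
        self.nodes = {}         # node_id -> ObjectNode
        self.relations = set()  # (subject_id, relation, object_id)
        self._next_id = 0

    def update(self, detections):
        """Insert or refresh nodes from (label, position, confidence) tuples."""
        for label, position, confidence in detections:
            node = self._find(label)
            if node is None:
                node = ObjectNode(self._next_id, label, position, confidence)
                self.nodes[node.node_id] = node
                self._next_id += 1
            else:
                node.position = position
                node.confidence = max(node.confidence, confidence)
        self._refresh_relations()

    def _find(self, label):
        return next((n for n in self.nodes.values() if n.label == label), None)

    def _refresh_relations(self):
        """Recompute one illustrative relation: 'near' for objects within 0.5 m."""
        self.relations = set()
        nodes = list(self.nodes.values())
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dist = sum((pa - pb) ** 2 for pa, pb in zip(a.position, b.position)) ** 0.5
                if dist < 0.5:
                    self.relations.add((a.node_id, "near", b.node_id))

    def query(self, text):
        """Return nodes whose label contains the query string; a real system
        would score candidates with a vision-language model instead."""
        return [n for n in self.nodes.values() if text.lower() in n.label.lower()]


# Example: two camera updates, then an object-retrieval query.
graph = DynamicSceneGraph()
graph.update([("red mug", (0.4, 0.1, 0.8), 0.9), ("laptop", (0.6, 0.2, 0.8), 0.8)])
graph.update([("red mug", (0.42, 0.1, 0.8), 0.95)])
print([n.label for n in graph.query("mug")])  # -> ['red mug']
print(graph.relations)                        # -> {(0, 'near', 1)}
```

In the papers above, the nodes and relations would come from detectors and vision-language models, and the graph would feed a planner that can replan when the scene changes; the sketch only shows the data-structure side of that loop.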
Sources
RoboRetriever: Single-Camera Robot Object Retrieval via Active and Interactive Perception with Dynamic Scene Graph
DEXTER-LLM: Dynamic and Explainable Coordination of Multi-Robot Systems in Unknown Environments via Large Language Models
PB-IAD: Utilizing multimodal foundation models for semantic industrial anomaly detection in dynamic manufacturing environments
Towards AI-based Sustainable and XR-based human-centric manufacturing: Implementation of ISO 23247 for digital twins of production systems