Advances in Human-Robot Interaction and Embodied Agents

The field of human-robot interaction and embodied agents is moving toward more practical and generalizable solutions. Recent research addresses vagueness in human instructions, open-world compositional zero-shot learning, and cross-task generalization in manipulation. Among notable papers, REI-Bench introduces a benchmark for robot task planning with vague referring expressions and pairs it with a simple yet effective approach that achieves state-of-the-art performance. Work on feasibility with language models for open-world compositional zero-shot learning shows that large language models can be leveraged to judge the feasibility of state-object combinations. Two simulation benchmarks also stand out: AutoBio, which evaluates robotic automation in biology laboratory environments, and AGNOSTOS, which evaluates cross-task zero-shot generalization in manipulation.

Sources

REI-Bench: Can Embodied Agents Understand Vague Human Instructions in Task Planning?

Feasibility with Language Models for Open-World Compositional Zero-Shot Learning

AutoBio: A Simulation and Benchmark for Robotic Automation in Digital Biology Laboratory

Mouse Lockbox Dataset: Behavior Recognition for Mice Solving Lockboxes

Exploring the Limits of Vision-Language-Action Manipulations in Cross-task Generalization

From Grounding to Manipulation: Case Studies of Foundation Model Integration in Embodied Robotic Systems

ManipLVM-R1: Reinforcement Learning for Reasoning in Embodied Manipulation with Large Vision-Language Models
