Advances in Generalist Robotics

The field of robotics is moving toward generalist robots that can perform a wide range of tasks across varied environments. Recent research has focused on two directions: training robots to imitate human actions, conditioned on sensor observations and textual instructions, and learning from large-scale human videos. These approaches have improved the ability of robots to generalize to novel objects, environments, and instructions involving abstract concepts. Notable papers in this area include:

  • Reinforcement Learning for Flow-Matching Policies, which introduces a new approach to training flow-matching policies via reinforcement learning.
  • GR-3 Technical Report, which presents a large-scale vision-language-action model that generalizes to novel objects and environments.
  • Being-H0: Vision-Language-Action Pretraining from Large-Scale Human Videos, which proposes a novel training paradigm that combines large-scale VLA pretraining from human videos with physical space alignment and post-training adaptation for robotic tasks.

Sources

Reinforcement Learning for Flow-Matching Policies

GR-3 Technical Report

Being-H0: Vision-Language-Action Pretraining from Large-Scale Human Videos

EMP: Executable Motion Prior for Humanoid Robot Standing Upper-body Motion Imitation

The Wilhelm Tell Dataset of Affordance Demonstrations
