Research in robotic manipulation is shifting toward realistic and responsible behavior, with an emphasis on physical reliability, generalization, and safe decision-making. Recent work introduces benchmarks and datasets that evaluate robotic systems on tasks such as appliance manipulation, rigid-object manipulation, and bin packing. These benchmarks provide a foundation for developing trustworthy robotic systems that operate in the real world. Noteworthy papers include:
- RealAppliance, which introduces a dataset and benchmark for evaluating multimodal large language models and embodied manipulation planning models in appliance manipulation tasks.
- ResponsibleRobotBench, which provides a systematic benchmark for evaluating responsible robotic manipulation using multimodal large language models.
- RoboBPP, which introduces a benchmarking system for robotic online bin packing with physics-based simulation.