Advancements in Autonomous Systems with Large Language Models

The field of autonomous systems is advancing rapidly through the integration of Large Language Models (LLMs). Recent work points toward more robust, adaptive, and generalizable agents: LLMs help systems interpret their environment and act on it, improving reliability and efficiency. In robotics and autonomous driving in particular, LLMs are raising performance on complex tasks such as task planning, motion planning, and decision-making, while frameworks that pair LLMs with complementary techniques such as reinforcement learning and computer vision extend these capabilities further. The overall trajectory is toward intelligent agents that operate effectively in real-world environments. Noteworthy papers include Think, Act, Learn, which introduces a framework for autonomous robotic agents built on closed-loop LLMs and achieves a 97% success rate on complex tasks, and VLMPlanner, which integrates visual language models with motion planning for autonomous driving and demonstrates superior planning performance in challenging scenarios. A sketch of the closed-loop pattern follows below.
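
To make the closed-loop idea concrete, here is a minimal sketch of a think-act-learn cycle in Python. The `query_llm` and `execute` helpers are hypothetical placeholders (any chat-completion endpoint and robot interface would do), and the sketch illustrates the general pattern rather than the Think, Act, Learn paper's actual implementation.

```python
"""Minimal sketch of a closed-loop LLM robotic agent (think-act-learn).
All helpers below are hypothetical placeholders, not a published API."""

from dataclasses import dataclass, field


@dataclass
class AgentMemory:
    """Lessons accumulated across steps and fed back into planning prompts."""
    lessons: list[str] = field(default_factory=list)


def query_llm(prompt: str) -> str:
    """Placeholder for an LLM call (e.g., a chat-completion endpoint)."""
    raise NotImplementedError


def execute(action: str) -> str:
    """Placeholder: send one action to the robot, return raw feedback."""
    raise NotImplementedError


def run_task(goal: str, memory: AgentMemory, max_steps: int = 20) -> bool:
    """Run one task episode; returns True if the agent declares success."""
    for _ in range(max_steps):
        # Think: plan the next action, conditioned on past lessons.
        action = query_llm(
            f"Goal: {goal}\n"
            f"Lessons so far: {memory.lessons}\n"
            "Propose the single next action, or say DONE."
        )
        if action.strip() == "DONE":
            return True
        # Act: execute in the environment and collect feedback.
        feedback = execute(action)
        # Learn: reflect on the outcome and store a reusable lesson.
        lesson = query_llm(
            f"Action: {action}\nFeedback: {feedback}\n"
            "Summarize one lesson for future attempts."
        )
        memory.lessons.append(lesson)
    return False
```

The key design point is that the loop closes through the environment: execution feedback is distilled into lessons that condition every later planning prompt, which is what distinguishes closed-loop agents from one-shot LLM planners.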

Sources

Think, Act, Learn: A Framework for Autonomous Robotic Agents using Closed-Loop Large Language Models

VLMPlanner: Integrating Visual Language Models with Motion Planning

LLMs-guided adaptive compensator: Bringing Adaptivity to Automatic Control Systems with Large Language Models

Uncertainty-aware Planning with Inaccurate Models for Robotized Liquid Handling

A Human-in-the-loop Approach to Robot Action Replanning through LLM Common-Sense Reasoning

DriveAgent-R1: Advancing VLM-based Autonomous Driving with Hybrid Thinking and Active Perception

Pretraining a Unified PDDL Domain from Real-World Demonstrations for Generalizable Robot Task Planning

CoEx -- Co-evolving World-model and Exploration

Early Goal-Guided Multi-Scale Fusion for Real-Time Vision-Language Driving

Vision-Language Fusion for Real-Time Autonomous Driving: Goal-Centered Cross-Attention of Camera, HD-Map, & Waypoints

A Unified Perception-Language-Action Framework for Adaptive Autonomous Driving

Can LLM-Reasoning Models Replace Classical Planning? A Benchmark Study

Scalable Multi-Task Reinforcement Learning for Generalizable Spatial Intelligence in Visuomotor Agents

SimuRA: Towards General Goal-Oriented Agent via Simulative Reasoning Architecture with LLM-Based World Model
