Advancements in Autonomous Learning and Planning for Large Language Models

The field of large language models (LLMs) is advancing rapidly, with a growing focus on autonomous learning and planning. Recent work has produced modular pipelines that continuously update an LLM's knowledge with minimal human intervention, enabling long-term continual learning so that models can adapt to rapidly evolving domains and improve over time.

A second line of research develops multi-agent frameworks for complex tasks such as index recommendation and planning; these frameworks have outperformed traditional methods and achieved state-of-the-art results on their respective benchmarks. Researchers are also exploring learning paradigms that avoid fine-tuning the underlying LLM altogether, enabling low-cost continual adaptation and real-time learning.

Noteworthy papers include ALAS, a modular pipeline for autonomous learning, and MAAdvisor, a zero-shot LLM-based index advisor built on a multi-agent framework. In addition, AgentFly introduces a learning paradigm for adaptive LLM agents that eliminates the need for fine-tuning, and Learn to Memorize proposes an adaptive memory framework for optimizing LLM-based agents. Together, these advances promise to improve the performance and efficiency of LLMs in complex, dynamic environments.
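The idea of adapting an agent without touching the underlying model's weights, as in the memory-based approaches above, can be sketched with a small episodic-memory loop: store past (task, outcome) episodes, retrieve the most relevant ones for a new task, and inject them into the prompt. This is a minimal illustrative sketch, not the actual API or algorithm of AgentFly or Learn to Memorize; all class and function names here are hypothetical, and similarity is approximated by simple token overlap.

```python
from collections import Counter

class EpisodicMemory:
    """Hypothetical store of past (task, outcome) episodes.

    Retrieval ranks episodes by token overlap with the query, a crude
    stand-in for the learned retrieval such frameworks would use.
    """

    def __init__(self):
        self.episodes = []  # list of (task, outcome) pairs

    def add(self, task: str, outcome: str) -> None:
        self.episodes.append((task, outcome))

    def retrieve(self, query: str, k: int = 2):
        """Return the k stored episodes most similar to the query."""
        q = Counter(query.lower().split())

        def overlap(episode):
            t = Counter(episode[0].lower().split())
            return sum((q & t).values())  # size of multiset intersection

        return sorted(self.episodes, key=overlap, reverse=True)[:k]

def build_prompt(memory: EpisodicMemory, task: str) -> str:
    """Adapt behaviour by prepending retrieved experience to the prompt,
    leaving the underlying model's weights untouched."""
    hints = memory.retrieve(task)
    context = "\n".join(f"Past: {t} -> {o}" for t, o in hints)
    return f"{context}\nTask: {task}"

memory = EpisodicMemory()
memory.add("sort a list of integers", "used sorted(); verified order")
memory.add("parse a CSV file", "used csv.reader; handled header row")
print(build_prompt(memory, "sort integers descending"))
```

The point of the sketch is the division of labour: all adaptation lives in the memory and the prompt construction, so the agent improves from experience while the LLM itself remains frozen.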

Sources

ALAS: Autonomous Learning Agent for Self-Updating Language Models

MAAdvisor: Zero-Shot Index Advisor using Multi-Agent LLMs

AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs

Learn to Memorize: Optimizing LLM-based Agents with Adaptive Memory Framework

Recall-Extend Dynamics: Enhancing Small Language Models through Controlled Exploration and Refined Offline Integration

KL-Regularised Q-Learning: A Token-level Action-Value perspective on Online RLHF

Detecting and Characterizing Planning in Language Models

Better Language Model-Based Judging Reward Modeling through Scaling Comprehension Boundaries

Language Models For Generalised PDDL Planning: Synthesising Sound and Programmatic Policies

Building Self-Evolving Agents via Experience-Driven Lifelong Learning: A Framework and Benchmark

HiPlan: Hierarchical Planning for LLM-Based Agents with Adaptive Global-Local Guidance

Encouraging Good Processes Without the Need for Good Answers: Reinforcement Learning for LLM Agent Planning

Memory-R1: Enhancing Large Language Model Agents to Manage and Utilize Memories via Reinforcement Learning

Can Compact Language Models Search Like Agents? Distillation-Guided Policy Optimization for Preserving Agentic RAG Capabilities

AI-SearchPlanner: Modular Agentic Search via Pareto-Optimal Multi-Objective Reinforcement Learning
