Large Language Models in Reinforcement Learning

The field of reinforcement learning is seeing significant advances from the integration of Large Language Models (LLMs). A key direction is the use of LLMs to design and refine reward functions, a component central to the performance and robustness of RL agents. This approach has shown promise across diverse applications, including generative auto-bidding, e-commerce payment fraud detection, and robotic hand design. Another notable trend is the development of bi-level frameworks that pair LLMs with complementary techniques, such as graph-based reasoning and visual language models, to improve the accuracy and effectiveness of the resulting reward functions. Noteworthy papers in this area include Enhancing Generative Auto-bidding with Offline Reward Evaluation and Policy Search, which proposes a method for integrating generative planning with policy optimization, and Reward Evolution with Graph-of-Thoughts, which introduces a bi-level framework for automated reward design.
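To make the reward-refinement idea concrete, here is a minimal sketch of the common pattern these papers share: propose several candidate reward functions (in practice generated by an LLM; here they are hand-written stand-ins), score each candidate offline against logged trajectories with known ground-truth returns, and keep the best one. All names (`make_trajectories`, `select_reward`, the candidate functions) are illustrative assumptions, not APIs from any of the cited papers.

```python
import random

random.seed(0)

def make_trajectories(n=50):
    """Logged trajectories: a list of states plus a ground-truth episode
    return (e.g. from human labels or a downstream business metric)."""
    trajs = []
    for _ in range(n):
        states = [random.uniform(0, 1) for _ in range(10)]
        # Ground truth here: episodes with higher mean state value score better.
        true_return = sum(states) / len(states)
        trajs.append((states, true_return))
    return trajs

# Candidate reward functions, standing in for LLM-proposed code snippets.
CANDIDATES = {
    "mean_state": lambda states: sum(states) / len(states),
    "final_state": lambda states: states[-1],
    "constant": lambda states: 0.5,
}

def rank_agreement(xs, ys):
    """Fraction of strictly concordant pairs (a Kendall-style score)."""
    concordant = total = 0
    for i in range(len(xs)):
        for j in range(i + 1, len(xs)):
            total += 1
            if (xs[i] - xs[j]) * (ys[i] - ys[j]) > 0:
                concordant += 1
    return concordant / total if total else 0.0

def select_reward(trajectories):
    """Offline evaluation: keep the candidate whose scores best rank
    trajectories in the same order as the ground-truth returns."""
    best_name, best_score = None, -1.0
    for name, fn in CANDIDATES.items():
        preds = [fn(states) for states, _ in trajectories]
        truths = [ret for _, ret in trajectories]
        score = rank_agreement(preds, truths)
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score

name, score = select_reward(make_trajectories())
print(name, round(score, 2))
```

In a full system, the selection step would feed back into the LLM as a critique ("this candidate mis-ranks these episodes"), producing the next generation of candidates; the bi-level frameworks above add a structured reasoning layer (e.g. a graph of sub-goals) on top of this inner loop.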

Sources

Enhancing Generative Auto-bidding with Offline Reward Evaluation and Policy Search

Reward Evolution with Graph-of-Thoughts: A Bi-Level Language Model Framework for Reinforcement Learning

LLM-Enhanced Self-Evolving Reinforcement Learning for Multi-Step E-Commerce Payment Fraud Risk Detection

Lang2Morph: Language-Driven Morphological Design of Robotic Hands

Online Process Reward Learning for Agentic Reinforcement Learning
