Large Language Models in Interactive and Multimodal Learning

The field of large language models (LLMs) is moving towards more interactive and multimodal learning approaches. Researchers are exploring ways to integrate LLMs with reinforcement learning, enabling models to learn from interaction and improve their performance on complex tasks. This shift is driven by the need for more effective and efficient learning methods, particularly in domains where data is scarce or difficult to obtain. Notable papers in this area include:

  • Cache-Efficient Posterior Sampling for Reinforcement Learning with LLM-Derived Priors Across Discrete and Continuous Domains, which presents a framework for posterior-sampling reinforcement learning that uses cached, LLM-derived priors in both discrete and continuous domains (a minimal illustration follows this list).
  • MLE-Dojo: Interactive Environments for Empowering LLM Agents in Machine Learning Engineering, which introduces an interactive environment for systematically training, via reinforcement learning, and evaluating LLM agents on machine learning engineering tasks (see the environment-loop sketch after this list).
  • Self Rewarding Self Improving, which demonstrates that LLMs can effectively self-improve by judging their own outputs, without requiring reference solutions (a toy version of the loop is sketched after this list).
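
As a concrete illustration of the first paper's theme, here is a deliberately small sketch of posterior sampling with an LLM-derived prior. It is not the paper's algorithm: it is a Bernoulli-bandit Thompson sampler whose Beta priors are seeded from hypothetical LLM scores, where `llm_prior_scores`, `prior_strength`, and all numeric values are illustrative assumptions. The "cache-efficient" aspect is reduced here to querying the (stubbed) LLM once and reusing the scores.

```python
import numpy as np

rng = np.random.default_rng(0)

def llm_prior_scores(actions):
    """Stand-in for a cached LLM call that rates each action's promise in [0, 1].
    A real system would prompt an LLM once per action and cache the responses."""
    return {"left": 0.3, "right": 0.8, "forward": 0.5}  # placeholder values

actions = ["left", "right", "forward"]
scores = llm_prior_scores(actions)   # queried once, reused thereafter
prior_strength = 4.0                 # pseudo-counts: how much to trust the LLM
alpha = {a: 1.0 + prior_strength * scores[a] for a in actions}
beta = {a: 1.0 + prior_strength * (1.0 - scores[a]) for a in actions}

true_p = {"left": 0.2, "right": 0.7, "forward": 0.5}  # hidden from the agent

for _ in range(500):
    # Thompson sampling: draw a mean-reward estimate from each Beta posterior
    # and act greedily with respect to the draws.
    draws = {a: rng.beta(alpha[a], beta[a]) for a in actions}
    a = max(draws, key=draws.get)
    r = int(rng.random() < true_p[a])  # Bernoulli reward from the environment
    alpha[a] += r                      # conjugate Beta-Bernoulli update
    beta[a] += 1 - r
```

In this toy setup the LLM prior only biases early exploration; as rewards accumulate, the posterior updates dominate and the agent converges on the best action regardless of the prior.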
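
An MLE-Dojo-style setup can be pictured as a standard environment loop. The sketch below is generic and does not reflect MLE-Dojo's actual API: `ToyMLEEnv`, `llm_policy`, the reward rule, and the episode length are all hypothetical placeholders for an environment that scores an LLM agent's engineering actions.

```python
class ToyMLEEnv:
    """Toy stand-in for an ML-engineering environment with a Gym-style
    reset/step interface; observations, rewards, and termination are illustrative."""

    def reset(self):
        self.t = 0
        return "task: raise validation accuracy"

    def step(self, action):
        self.t += 1
        reward = 1.0 if "tune" in action else 0.0  # toy reward signal
        done = self.t >= 3                         # fixed toy episode length
        return f"result of '{action}'", reward, done

def llm_policy(observation):
    """Stand-in for an LLM call that maps an observation to an action string."""
    return "tune hyperparameters"

env = ToyMLEEnv()
obs, done, total_reward = env.reset(), False, 0.0
while not done:
    action = llm_policy(obs)              # the agent decides from the observation
    obs, reward, done = env.step(action)  # the environment returns feedback
    total_reward += reward                # the signal an RL method would optimize
```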
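
Finally, the self-rewarding idea can be summarized as a control-flow skeleton. Everything below is a stub: `generate`, `judge`, and `finetune` stand in for model sampling, self-judging, and training, and none of it reflects the paper's concrete recipe; the point is only that no reference solutions enter the loop.

```python
import random

random.seed(0)

def generate(model, problem, n=4):
    """Stand-in: sample n candidate solutions from the current model."""
    return [f"{problem}/candidate-{i}@v{model['version']}" for i in range(n)]

def judge(model, problem, candidate):
    """Stand-in: the same model scores its own candidate; no reference
    solution is consulted anywhere."""
    return random.random()

def finetune(model, preferred):
    """Stand-in: update the model toward its self-preferred outputs."""
    model["version"] += 1
    model["train_set"].extend(preferred)
    return model

model = {"version": 0, "train_set": []}
problems = ["p1", "p2", "p3"]

for _ in range(3):  # rounds of self-improvement
    preferred = []
    for problem in problems:
        candidates = generate(model, problem)
        # Self-judging: keep the candidate the model itself rates highest.
        best = max(candidates, key=lambda c: judge(model, problem, c))
        preferred.append((problem, best))
    model = finetune(model, preferred)  # train on self-selected winners
```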

Sources

Towards Developmentally Plausible Rewards: Communicative Success as a Learning Signal for Interactive Language Models

Cache-Efficient Posterior Sampling for Reinforcement Learning with LLM-Derived Priors Across Discrete and Continuous Domains

MLE-Dojo: Interactive Environments for Empowering LLM Agents in Machine Learning Engineering

Putting It All into Context: Simplifying Agents with LCLMs

Self Rewarding Self Improving

ImagineBench: Evaluating Reinforcement Learning with Large Language Model Rollouts

MathCoder-VL: Bridging Vision and Code for Enhanced Multimodal Mathematical Reasoning

Built with on top of