Advances in Compositional Generalization

Artificial intelligence research is increasingly focused on models that generalize compositionally, synthesizing novel skills from known components. Recent work examines how reinforcement learning (RL) contributes to this ability, with findings suggesting that RL can induce out-of-distribution generalization and compositional reuse of subtasks. Notably, novel methods such as looped locate-and-replace pipelines have shown promising results on depth generalization in recursive logic tasks. Other work highlights the value of decoupling atomic skills and using RL to synthesize complex reasoning strategies, reporting significant gains in generalization. Finally, studies of sequential enumeration in large language models reveal a persistent gap between neural and symbolic approaches to compositional generalization.

Some noteworthy papers: 'From Atomic to Composite: Reinforcement Learning Enables Generalization in Complementary Reasoning' finds that RL acts as a reasoning synthesizer rather than a probability amplifier, synthesizing complex strategies from learned primitives without explicit supervision. 'Exploring Depth Generalization in Large Language Models for Solving Recursive Logic Tasks' develops a looped locate-and-replace pipeline that alleviates performance decay on out-of-distribution recursion depths.
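To make the locate-and-replace idea concrete, here is a minimal symbolic sketch of the loop structure (not the paper's actual pipeline, which applies a language model at each step): a rewriter that only ever handles depth-1 structure is applied repeatedly, so arbitrary recursion depth is handled by the outer loop rather than by the single-step model.

```python
import re

def reduce_once(expr: str) -> str:
    """Locate one innermost NOT(...) over a literal and replace it.

    Stand-in for the model's single shallow rewrite step; a regex
    plays the role of the 'locate' operation here.
    """
    return re.sub(
        r"NOT\((TRUE|FALSE)\)",
        lambda m: "FALSE" if m.group(1) == "TRUE" else "TRUE",
        expr,
        count=1,
    )

def looped_eval(expr: str) -> str:
    """Loop the shallow step until the formula is a single literal."""
    while expr not in ("TRUE", "FALSE"):
        reduced = reduce_once(expr)
        if reduced == expr:  # no reducible subterm found
            raise ValueError(f"stuck: {expr}")
        expr = reduced
    return expr

# Depths never seen by the one-step rewriter are handled by looping it.
print(looped_eval("NOT(NOT(NOT(TRUE)))"))  # FALSE
```

The design point is that depth generalization comes for free once the task is decomposed: the per-step component is only ever asked to do a fixed-depth rewrite, and the loop supplies the recursion.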

Sources

How Does RL Post-training Induce Skill Composition? A Case Study on Countdown

From Atomic to Composite: Reinforcement Learning Enables Generalization in Complementary Reasoning

Exploring Depth Generalization in Large Language Models for Solving Recursive Logic Tasks

Understanding LLM Reasoning for Abstractive Summarization

SkillFactory: Self-Distillation For Learning Cognitive Behaviors

Sequential Enumeration in Large Language Models
