Continual Learning and Memorization in Large Language Models

Research on large language models is increasingly focused on models that can continually learn and adapt over time without catastrophic forgetting. Recent work explores techniques such as sparse parameter updates, gated continual learning, and model merging to mitigate forgetting and improve reliability. A complementary line of research examines how language models memorize and retain knowledge, studying how pretraining diversity, data scale, and model architecture affect memorization risks. Noteworthy papers include Continual Learning via Sparse Memory Finetuning, which restricts updates to a sparse subset of parameters, and Hubble, a model suite designed to advance the study of LLM memorization. Together, these efforts point towards more robust, adaptable language models that can learn and improve over time.
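
To make the sparse-update idea concrete, the sketch below is a minimal PyTorch illustration (not the specific procedure from Continual Learning via Sparse Memory Finetuning): each gradient step modifies only a small fraction of high-magnitude gradient entries per tensor, leaving the remaining weights untouched to limit interference with previously learned behavior. The function name, hyperparameters, and top-k selection heuristic are illustrative assumptions.

```python
import torch
import torch.nn as nn

def sparse_update_step(model: nn.Module, loss: torch.Tensor,
                       lr: float = 1e-4, top_frac: float = 0.01) -> None:
    """One sparse parameter update: modify only the top_frac fraction of
    entries (by gradient magnitude) in each weight tensor."""
    model.zero_grad()
    loss.backward()
    with torch.no_grad():
        for param in model.parameters():
            if param.grad is None:
                continue
            grad = param.grad
            k = max(1, int(top_frac * grad.numel()))
            # Threshold at the k-th largest gradient magnitude in this tensor.
            threshold = grad.abs().flatten().topk(k).values.min()
            mask = (grad.abs() >= threshold).to(grad.dtype)
            param -= lr * grad * mask

# Toy usage: one continual-learning step on a small batch.
model = nn.Linear(16, 4)
x, y = torch.randn(8, 16), torch.randint(0, 4, (8,))
loss = nn.functional.cross_entropy(model(x), y)
sparse_update_step(model, loss)
```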

Sources

Continual Learning via Sparse Memory Finetuning

Emergence of Linear Truth Encodings in Language Models

Facts in Stats: Impacts of Pretraining Diversity on Language Model Generalization

STABLE: Gated Continual Learning for Large Language Models

On the Impossibility of Retrain Equivalence in Machine Unlearning

MemoryBench: A Benchmark for Memory and Continual Learning in LLM Systems

Navigating the Alignment-Calibration Trade-off: A Pareto-Superior Frontier via Model Merging

Mapping Post-Training Forgetting in Language Models at Scale

Imbalanced Gradients in RL Post-Training of Multi-Task LLMs

Conditions for Catastrophic Forgetting in Multilingual Translation

Hubble: a Model Suite to Advance the Study of LLM Memorization
