The field of large language models (LLMs) is moving toward more dynamic and adaptive architectures that improve both efficiency and capability. Researchers are exploring new ways to scale models, such as modular composition and layer-wise expansion, which build more capable models without a proportional increase in training resources. A complementary line of work adapts the architecture to the input or task at hand, for example through input-conditioned layer dropping and test-time depth adaptation. Noteworthy papers include:
- Growing Transformers, which demonstrates that specialist models can be merged into a single, more capable model and introduces a layer-wise constructive training methodology (a minimal growth sketch follows right after this list).
- Skip a Layer or Loop it, which presents a method for test-time depth adaptation of pretrained LLMs, effectively customizing the model's depth for each test sample (a depth-adaptation sketch follows below).
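
To make the layer-wise constructive idea concrete, here is a minimal PyTorch sketch that grows a stack one block at a time, freezing previously trained blocks and fitting only the newest one. The helper `grow_one_layer`, the placeholder reconstruction loss, and the growth schedule are illustrative assumptions, not the exact procedure from Growing Transformers.

```python
# A minimal sketch of layer-wise constructive training, assuming the stack is
# grown one block at a time while previously trained blocks stay frozen. The
# helper grow_one_layer, the placeholder reconstruction loss, and the growth
# schedule are illustrative assumptions, not the exact procedure from the paper.
import torch
import torch.nn as nn


def grow_one_layer(stack: nn.ModuleList, d_model: int) -> nn.ModuleList:
    """Freeze the existing blocks, then append a freshly initialized block."""
    for block in stack:
        for p in block.parameters():
            p.requires_grad_(False)
    stack.append(
        nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True)
    )
    return stack


stack = nn.ModuleList()
for stage in range(4):  # grow a 4-layer stack constructively
    stack = grow_one_layer(stack, d_model=64)
    new_params = [p for p in stack.parameters() if p.requires_grad]
    opt = torch.optim.AdamW(new_params, lr=1e-4)
    for _ in range(10):  # briefly fit only the newest block
        x = torch.randn(2, 16, 64)  # (batch, seq_len, d_model) dummy inputs
        h = x
        for block in stack:
            h = block(h)
        loss = nn.functional.mse_loss(h, x)  # placeholder objective
        opt.zero_grad()
        loss.backward()
        opt.step()
```

The same freezing pattern would apply when merging in a specialist block rather than a randomly initialized one; only the source of the appended layer changes.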
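
Input-conditioned depth adaptation can be sketched similarly: a lightweight gate scores the current hidden state and decides whether to skip a pretrained layer, apply it once, or loop it. The gate design, the thresholds, and the batch-level decision rule below are illustrative assumptions, not the mechanism from Skip a Layer or Loop it.

```python
# A minimal sketch of input-conditioned depth adaptation: a tiny gate scores the
# current hidden state and decides whether to skip a pretrained layer, apply it
# once, or loop it. Gate design and thresholds are illustrative assumptions;
# per-sample routing would run with batch size 1 (or branch per example).
import torch
import torch.nn as nn


class DepthAdaptiveStack(nn.Module):
    def __init__(self, layers: nn.ModuleList, hidden_size: int, max_loops: int = 2):
        super().__init__()
        self.layers = layers  # pretrained blocks standing in for a frozen LLM stack
        self.max_loops = max_loops
        # One scalar gate per layer, scored from the mean token representation.
        self.gates = nn.ModuleList(nn.Linear(hidden_size, 1) for _ in layers)

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, seq_len, hidden_size)
        for layer, gate in zip(self.layers, self.gates):
            score = torch.sigmoid(gate(hidden.mean(dim=1))).mean()
            if score < 0.3:
                continue  # skip this layer entirely
            loops = self.max_loops if score > 0.8 else 1
            for _ in range(loops):  # apply the layer once, or loop it
                hidden = layer(hidden)
        return hidden


# Usage with a generic 6-layer stack standing in for a pretrained model.
layers = nn.ModuleList(
    nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True)
    for _ in range(6)
)
model = DepthAdaptiveStack(layers, hidden_size=64)
out = model(torch.randn(1, 16, 64))  # batch size 1: per-sample depth decisions
```

In practice the gates would need to be calibrated or trained so that easy inputs traverse fewer layers while harder inputs receive full or looped depth; the sketch only illustrates where such decisions slot into the forward pass.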