Advances in Neural Network Training and Language Model Research

The field of neural network training and language model research is evolving rapidly, with a focus on improving model performance, robustness, and interpretability. Recent studies have investigated critical learning periods, warm-starting, and learning hyperparameters in neural network training, highlighting the need to handle these factors carefully to avoid performance loss and reach optimal results. In the area of language models, researchers have explored the relationship between model performance and linguistic complexity, with some studies suggesting that model behavior is driven more by the richness of available linguistic resources than by sensitivity to grammatical complexity. Other work has focused on new methods for analyzing and interpreting language model behavior, such as input attribution methods that identify the specific words most important for brain-LLM alignment (a minimal sketch of this kind of attribution appears below).

Noteworthy papers in this area include 'The Achilles' Heel of LLMs: How Altering a Handful of Neurons Can Cripple Language Abilities', which investigates the existence of critical neurons in large language models and their impact on model performance, and 'Hierarchical Frequency Tagging Probe (HFTP): A Unified Approach to Investigate Syntactic Structure Representations in Large Language Models and the Human Brain', which introduces a new tool for analyzing syntactic structure representations in large language models and the human brain.
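As a rough illustration of the kind of input attribution mentioned above, the sketch below scores each input token of a small causal language model by a gradient-times-embedding saliency. The choice of model (GPT-2), attribution target (the top next-token logit), and scoring rule are illustrative assumptions, not the method used in the cited brain-LLM alignment work.

```python
# Minimal sketch of gradient-based input attribution for token importance.
# Assumptions (not from the cited papers): GPT-2 as the model, the logit of
# the most likely next token as the scalar being attributed, and an L2 norm
# over gradient x embedding as the per-token importance score.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

text = "The quick brown fox jumps over the lazy dog"
inputs = tokenizer(text, return_tensors="pt")

# Embed the input ids ourselves so gradients can flow back to the embeddings.
embeddings = model.get_input_embeddings()(inputs["input_ids"]).detach()
embeddings.requires_grad_(True)

outputs = model(inputs_embeds=embeddings, attention_mask=inputs["attention_mask"])

# Scalar target: the logit of the most likely next token at the final position.
final_logits = outputs.logits[0, -1]
target = final_logits[final_logits.argmax()]
target.backward()

# Gradient x embedding, L2-normed per token, as a crude importance score.
scores = (embeddings.grad[0] * embeddings[0]).norm(dim=-1)
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for tok, score in sorted(zip(tokens, scores.tolist()), key=lambda x: -x[1]):
    print(f"{tok:>12s}  {score:.4f}")
```

Higher scores flag tokens whose embeddings most influence the chosen output; more elaborate attribution schemes (e.g., integrated gradients) follow the same pattern of attributing a scalar model output back to input tokens.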

Sources

On the Occurence of Critical Learning Periods in Neural Networks

The Achilles' Heel of LLMs: How Altering a Handful of Neurons Can Cripple Language Abilities

Fine-grained Analysis of Brain-LLM Alignment through Input Attribution

Resource-sensitive but language-blind: Community size and not grammatical complexity better predicts the accuracy of Large Language Models in a novel Wug Test

Which Word Orders Facilitate Length Generalization in LMs? An Investigation with GCG-Based Artificial Languages

Language Models Model Language

Hierarchical Frequency Tagging Probe (HFTP): A Unified Approach to Investigate Syntactic Structure Representations in Large Language Models and the Human Brain

Investigating Lexical Change through Cross-Linguistic Colexification Patterns

Readability ≠ Learnability: Rethinking the Role of Simplicity in Training Small Language Models

Think Globally, Group Locally: Evaluating LLMs Using Multi-Lingual Word Grouping Games

Quantifying Phonosemantic Iconicity Distributionally in 6 Languages
