Advances in Explainable AI and Interpretable Machine Learning

The field of artificial intelligence is moving toward greater transparency and accountability, with a growing focus on explainable AI and interpretable machine learning. Recent research has made significant progress on techniques that expose the decision-making processes of complex models, including large language models and deep learning systems.

One notable direction is the use of symbolic regression and genetic programming to discover compact, interpretable formulas that describe observed data. Another is the development of methods for analyzing and explaining the internal representations of large language models, including vector symbolic architectures and contrastive explanations. There is also growing interest in alternatives to traditional next-token prediction in text generation, such as plan-then-generate and latent-reasoning methods.

Noteworthy papers in this area include 'From Embeddings to Equations: Genetic-Programming Surrogates for Interpretable Transformer Classification', which uses genetic-programming surrogates to obtain compact, auditable classifiers with calibrated probabilities, and 'Query Circuits: Explaining How Language Models Answer User Prompts', which traces the flow of information inside a model to explain its output. Overall, this emphasis on transparency and interpretability has significant implications for building trustworthy and reliable AI systems.
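To make the symbolic-regression direction concrete, the following is a minimal genetic-programming sketch — a generic toy illustration, not the method of any paper listed below. It evolves small expression trees (built from +, -, * over `x` and a constant) to fit data sampled from y = x² + x, using elitism, subtree crossover, and subtree mutation; the fittest tree is the discovered "formula".

```python
import math
import operator
import random

random.seed(0)

# Toy dataset a compact formula should describe: y = x^2 + x.
xs = [i / 10 for i in range(-20, 21)]
ys = [x * x + x for x in xs]

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}
TERMINALS = ["x", 1.0]

def random_tree(depth_left=3):
    """Grow a random expression tree of bounded depth."""
    if depth_left == 0 or random.random() < 0.3:
        return random.choice(TERMINALS)
    op = random.choice(list(OPS))
    return (op, random_tree(depth_left - 1), random_tree(depth_left - 1))

def evaluate(tree, x):
    """Recursively evaluate an expression tree at point x."""
    if tree == "x":
        return x
    if not isinstance(tree, tuple):
        return tree  # numeric constant
    op, left, right = tree
    return OPS[op](evaluate(left, x), evaluate(right, x))

def fitness(tree):
    """Sum of squared errors on the dataset; lower is better."""
    err = 0.0
    for x, y in zip(xs, ys):
        d = evaluate(tree, x) - y
        err += d * d
    return err if math.isfinite(err) else math.inf

def depth(tree):
    if not isinstance(tree, tuple):
        return 0
    return 1 + max(depth(tree[1]), depth(tree[2]))

def mutate(tree):
    """Replace a random subtree with a freshly grown one."""
    if not isinstance(tree, tuple) or random.random() < 0.3:
        return random_tree(2)
    op, left, right = tree
    if random.random() < 0.5:
        return (op, mutate(left), right)
    return (op, left, mutate(right))

def random_subtree(tree):
    while isinstance(tree, tuple) and random.random() < 0.6:
        tree = random.choice(tree[1:])
    return tree

def crossover(a, b):
    """Splice a random subtree of b into a random spot in a."""
    if not isinstance(a, tuple) or random.random() < 0.3:
        return random_subtree(b)
    op, left, right = a
    if random.random() < 0.5:
        return (op, crossover(left, b), right)
    return (op, left, crossover(right, b))

def offspring(a, b):
    """Crossover then mutation, with a depth cap to curb bloat."""
    child = mutate(crossover(a, b))
    return child if depth(child) <= 8 else random_tree()

# Evolve: keep the fittest quarter as elites, refill with offspring.
pop = [random_tree() for _ in range(200)]
for _ in range(30):
    pop.sort(key=fitness)
    elites = pop[:50]
    pop = elites + [offspring(random.choice(elites), random.choice(elites))
                    for _ in range(150)]
best = min(pop, key=fitness)
print("best formula:", best, "error:", fitness(best))
```

Real systems add much more (constant optimization, parsimony pressure, richer function sets), but the interpretability payoff is visible even here: the output is an explicit algebraic expression rather than opaque weights.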
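The vector-symbolic-architecture direction rests on a simple core operation that can be sketched in a few lines — again a generic illustration of the binding/unbinding idea behind the multiply-style VSA family, not the specific technique of any paper cited here. A role vector is bound to a filler vector by element-wise multiplication of bipolar hypervectors; because binding is self-inverse, multiplying by the role again recovers the filler, while unrelated vectors stay nearly orthogonal.

```python
import random

random.seed(0)
D = 10000  # high dimensionality makes random vectors nearly orthogonal

def rand_hv():
    """Random bipolar hypervector with entries in {-1, +1}."""
    return [random.choice((-1, 1)) for _ in range(D)]

def bind(a, b):
    """Element-wise multiplication: a self-inverse binding operation."""
    return [x * y for x, y in zip(a, b)]

def sim(a, b):
    """Normalized dot product in [-1, 1]."""
    return sum(x * y for x, y in zip(a, b)) / D

role, filler, other = rand_hv(), rand_hv(), rand_hv()
bound = bind(role, filler)     # encode the role-filler pair
recovered = bind(bound, role)  # unbind: role * role is all ones
print(sim(recovered, filler))  # high: filler recovered
print(sim(recovered, other))   # near zero: unrelated vector
```

This invertibility is what makes such representations attractive for analysis: structured information packed into a single vector can be queried back out component by component.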
Sources
From Embeddings to Equations: Genetic-Programming Surrogates for Interpretable Transformer Classification
Beyond Formula Complexity: Effective Information Criterion Improves Performance and Interpretability for Symbolic Regression