The fields of natural language processing and educational data mining are seeing notable progress, particularly around large language models (LLMs) and knowledge tracing (KT). Researchers are exploring ways to improve the performance and efficiency of LLMs, for example by rethinking training objectives and leveraging pretrained representations. In KT, there is a growing emphasis on models that are both more accurate and more interpretable, so that they can reliably predict student behavior and support teacher decision-making.

Noteworthy papers in this area include:

- Training LLMs Beyond Next Token Prediction - Filling the Mutual Information Gap, which proposes training LLMs to focus on predicting information-rich tokens rather than treating all tokens uniformly (a hedged sketch of this idea follows below).
- Next Token Knowledge Tracing: Exploiting Pretrained LLM Representations to Decode Student Behaviour, which reframes KT as a next-token prediction problem over serialized student interaction histories, so that pretrained LLMs can be used to decode student behaviour (see the second sketch below).
- Extracting Causal Relations in Deep Knowledge Tracing, which challenges the prevailing explanation for the performance gains of Deep Knowledge Tracing (DKT) and argues that its strength lies in modeling prerequisite relationships between skills as a causal structure (see the probing sketch below).
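The paper's own training method is not detailed here, so the following is only a minimal sketch of one plausible reading of "predicting information-rich tokens": re-weighting the standard next-token cross-entropy loss by each gold token's surprisal under a frozen reference model, so that high-information tokens dominate the gradient. The surprisal proxy and normalization scheme are assumptions, not the authors' design.

```python
import torch
import torch.nn.functional as F

def info_weighted_loss(logits, targets, reference_logits):
    """Next-token cross-entropy re-weighted by token surprisal under a
    frozen reference model (higher surprisal ~ more informative token).

    logits:           (batch, seq, vocab) from the model being trained
    targets:          (batch, seq) gold next-token ids
    reference_logits: (batch, seq, vocab) from a frozen reference model
    """
    # Per-token cross-entropy of the trained model, no reduction yet.
    ce = F.cross_entropy(
        logits.flatten(0, 1), targets.flatten(), reduction="none"
    ).view_as(targets)

    with torch.no_grad():
        # Surprisal of each gold token under the reference model, used as
        # a crude proxy for its information content (an assumption here).
        ref_logprobs = F.log_softmax(reference_logits, dim=-1)
        surprisal = -ref_logprobs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
        # Normalize so the overall loss scale stays comparable.
        weights = surprisal / surprisal.mean().clamp(min=1e-8)

    return (weights * ce).mean()
```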
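The next-token reframing of KT can be illustrated concretely: serialize the student's past (question, correctness) pairs as text, append the next question, and read the pretrained LM's next-token distribution over outcome words. The serialization format, the outcome vocabulary ("correct"/"wrong"), and the choice of GPT-2 are all illustrative assumptions, not the paper's actual setup.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def p_correct(history, next_question):
    """history: list of (question_id, was_correct) pairs."""
    # Hypothetical serialization of the interaction history as plain text.
    prompt = " ".join(
        f"Q{q} {'correct' if c else 'wrong'}" for q, c in history
    )
    prompt += f" Q{next_question}"

    # First token id of each outcome word (leading space matters for GPT-2).
    correct_id = tokenizer.encode(" correct")[0]
    wrong_id = tokenizer.encode(" wrong")[0]

    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]  # next-token distribution

    # Renormalize over the two outcomes and return P("correct").
    probs = torch.softmax(logits[[correct_id, wrong_id]], dim=-1)
    return probs[0].item()

print(p_correct([(12, True), (12, True), (17, False)], next_question=17))
```

In practice the paper exploits pretrained representations rather than raw GPT-2 text scoring, but the sketch shows why the framing is attractive: the KT model inherits the LM's sequence machinery for free.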
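One common way to probe a trained DKT model for prerequisite-like structure, and a plausible reading of the causal-relations analysis, is an influence matrix: observe a single correct answer on skill i and measure how the predicted success probability shifts for every skill j. The `dkt_model` interface below (one-hot interaction vectors of size 2 * num_skills in, per-skill probabilities out, the standard DKT encoding) is an assumption for illustration.

```python
import torch

def influence_matrix(dkt_model, num_skills):
    """influence[i, j] = predicted P(correct on skill j) after observing
    one correct response on skill i, minus the same prediction with an
    empty history. Large positive entries suggest i acts as a
    prerequisite signal for j."""
    dkt_model.eval()
    with torch.no_grad():
        # Baseline prediction from an all-zero (empty) interaction.
        empty = torch.zeros(1, 1, 2 * num_skills)
        baseline = dkt_model(empty)[0, -1]  # shape: (num_skills,)

        influence = torch.zeros(num_skills, num_skills)
        for i in range(num_skills):
            x = torch.zeros(1, 1, 2 * num_skills)
            x[0, 0, i] = 1.0  # one-hot: skill i answered correctly
            influence[i] = dkt_model(x)[0, -1] - baseline
    return influence
```

Reading the strongest asymmetric entries of this matrix as directed edges yields the kind of prerequisite graph that, per the paper's argument, explains much of DKT's advantage over models that treat skills independently.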