Advances in Large Language Models and Knowledge Graphs

The field of natural language processing is seeing rapid progress in large language models (LLMs) and knowledge graphs. Recent work focuses on making LLMs more efficient and effective across tasks such as text generation, language understanding, and knowledge extraction. One key trend is the use of modular techniques for synthetic long-context data generation, which make it possible to construct high-quality long-context training data for LLMs. Another is injecting causal knowledge into LLMs to improve their robustness in out-of-distribution scenarios. Ontology-guided open-domain knowledge extraction is also gaining traction, using an ontology to steer LLM-based extraction so that large amounts of knowledge can be automatically extracted from web sources and ingested into knowledge graphs. Noteworthy papers in this area include POINTS-Reader, which proposes a distillation-free framework for constructing high-quality document extraction datasets and models, and CAT, which injects fine-grained causal knowledge into LLMs via causal attention tuning. Together, these advances stand to improve the performance and applicability of LLMs and knowledge graphs in a range of real-world applications.
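To make the ontology-guided extraction idea more concrete, the following is a minimal sketch of a prompt-based triplet extractor whose outputs are filtered against a small ontology of allowed relations. It is an illustrative assumption, not the pipeline of ODKE+ or the triplet-extraction paper listed below; the relation names, prompt wording, and the `fake_llm` stub are all hypothetical.

```python
import json
from typing import Callable, List, Tuple

# Hypothetical ontology schema: the only relation types the extractor may emit.
ONTOLOGY_RELATIONS = ["defines", "requires", "part_of", "applies_to"]

PROMPT_TEMPLATE = (
    "Extract (subject, relation, object) triples from the text below.\n"
    "Only use these relations: {relations}.\n"
    "Answer with a JSON list of [subject, relation, object] triples.\n\n"
    "Text: {text}"
)

def extract_triples(text: str, llm: Callable[[str], str]) -> List[Tuple[str, str, str]]:
    """Prompt an LLM for triples, then keep only those that fit the ontology."""
    prompt = PROMPT_TEMPLATE.format(
        relations=", ".join(ONTOLOGY_RELATIONS), text=text
    )
    raw = llm(prompt)
    try:
        candidates = json.loads(raw)
    except json.JSONDecodeError:
        return []  # malformed model output: drop it rather than ingest noise
    triples = []
    for item in candidates:
        if (isinstance(item, list) and len(item) == 3
                and item[1] in ONTOLOGY_RELATIONS):
            triples.append((item[0], item[1], item[2]))
    return triples

if __name__ == "__main__":
    # Stub LLM so the sketch runs without any API access.
    def fake_llm(prompt: str) -> str:
        return '[["ISO 26262", "applies_to", "automotive software"]]'

    text = "ISO 26262 governs the functional safety of automotive software."
    print(extract_triples(text, fake_llm))
```

The ontology acts as a schema-level guardrail: anything the model emits outside the allowed relation set is discarded before it reaches the knowledge graph, which is the basic motivation behind guiding open-domain extraction with an ontology.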

Sources

LLM-based Triplet Extraction for Automated Ontology Generation in Software Engineering Standards

POINTS-Reader: Distillation-Free Adaptation of Vision-Language Models for Document Conversion

Modular Techniques for Synthetic Long-Context Data Generation in Language Model Training and Evaluation

CAT: Causal Attention Tuning For Injecting Fine-grained Causal Knowledge into Large Language Models

An Epidemiological Knowledge Graph extracted from the World Health Organization's Disease Outbreak News

LMEnt: A Suite for Analyzing Knowledge in Language Models from Pretraining Data to Representations

Learning Mechanism Underlying NLP Pre-Training and Fine-Tuning

No Clustering, No Routing: How Transformers Actually Process Rare Tokens

Interpreting Transformer Architectures as Implicit Multinomial Regression

ODKE+: Ontology-Guided Open-Domain Knowledge Extraction with LLMs

ACE-RL: Adaptive Constraint-Enhanced Reward for Long-form Generation Reinforcement Learning

Crosscoding Through Time: Tracking Emergence & Consolidation Of Linguistic Representations Throughout LLM Pretraining

Selective Induction Heads: How Transformers Select Causal Structures In Context

Do All Autoregressive Transformers Remember Facts the Same Way? A Cross-Architecture Analysis of Recall Mechanisms
