Advancements in Large Language Models

The field of large language models (LLMs) is evolving rapidly, with a focus on improving reasoning and knowledge-intensive capabilities. Recent work centers on integrating external knowledge and strengthening models' ability to perform complex tasks such as multi-hop question answering and claim verification. Researchers are also exploring ways to mitigate internal bias in LLMs and improve their factual robustness, alongside growing interest in models that withstand adversarial attacks and remain reliable in real-world applications.

Some noteworthy papers in this area include:

KG-o1 proposes a four-stage approach that integrates knowledge graphs into LLMs to strengthen multi-hop reasoning (a minimal prompting sketch follows this list).
Self-Disguise Attack introduces an approach that induces LLMs to disguise their own output and evade AI-generated-text detectors.
From Confidence to Collapse in LLM Factual Robustness presents a principled way to measure factual robustness in LLMs, highlighting the importance of the generation process rather than surface answers alone.
Unbiased Reasoning for Knowledge-Intensive Tasks in Large Language Models via Conditional Front-Door Adjustment proposes a causal prompting framework for unbiased estimation of the causal effect of the query on the answer (the classical front-door formula is recalled after this list).
Graph-R1: Incentivizing the Zero-Shot Graph Learning Capability in LLMs via Explicit Reasoning introduces a GNN-free approach that reformulates graph tasks as textual reasoning problems solved by LLMs, demonstrating the value of explicit reasoning for graph learning (see the serialization sketch below).
SSFO proposes a self-supervised alignment approach that improves RAG faithfulness, with significant gains over existing methods.
UniC-RAG presents a universal knowledge corruption attack against RAG systems, underscoring the need for new defense mechanisms.
EMMM proposes an explanation-then-detection framework for trustworthy machine-generated-text (MGT) detection with competitive accuracy and low latency.
Disabling Self-Correction in Retrieval-Augmented Generation via Stealthy Retriever Poisoning introduces a poisoning paradigm that compromises the retriever itself to suppress the self-correction ability of modern LLMs.
Graph-R1: Unleashing LLM Reasoning with NP-Hard Graph Problems develops a two-stage post-training framework that uses NP-hard graph problems as a synthetic training corpus.
Fact or Facsimile? Evaluating the Factual Robustness of Modern Retrievers evaluates the factual competence of dense retrievers and rerankers, revealing a systematic trade-off introduced by contrastive learning.
Enhancing Health Fact-Checking with LLM-Generated Synthetic Data proposes a synthetic data generation pipeline that uses LLMs to augment training data for health-related fact-checking.
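
To make the knowledge-graph integration idea concrete, here is a minimal sketch of retrieving multi-hop triples and linearizing them into a prompt. KG-o1's actual four-stage pipeline is not detailed above, so the toy KG, the hop-limited retrieval heuristic, and the prompt template are illustrative assumptions rather than the paper's method.

```python
# Minimal sketch of KG-augmented multi-hop prompting (illustrative only;
# not KG-o1's actual pipeline).
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)

TOY_KG: List[Triple] = [
    ("Marie Curie", "born_in", "Warsaw"),
    ("Warsaw", "capital_of", "Poland"),
]

def retrieve_triples(kg: List[Triple], entity: str, hops: int = 2) -> List[Triple]:
    """Collect triples reachable from `entity` within `hops` steps."""
    frontier, found = {entity}, []
    for _ in range(hops):
        step = [t for t in kg if t[0] in frontier and t not in found]
        found.extend(step)
        frontier = {obj for _, _, obj in step}
    return found

def build_prompt(question: str, triples: List[Triple]) -> str:
    """Linearize retrieved triples as evidence lines for the LLM."""
    facts = "\n".join(f"- {s} {r.replace('_', ' ')} {o}" for s, r, o in triples)
    return f"Facts:\n{facts}\n\nAnswer step by step: {question}"

print(build_prompt("In which country was Marie Curie born?",
                   retrieve_triples(TOY_KG, "Marie Curie")))
```

The key design point is that the LLM never sees the graph structure directly; it receives a short, question-relevant slice of the KG as plain text, which is what makes multi-hop evidence usable by an ordinary prompted model.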
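For the causal-prompting line of work, the classical front-door adjustment conveys the flavor of the approach: with query $q$, a mediator $r$ (for example, a generated reasoning chain), and answer $a$, the causal effect of the query on the answer can be estimated without observing the latent confounder. The paper's conditional variant is not spelled out above; this is the standard formula it presumably builds on.

```latex
% Classical front-door adjustment: q = query, r = mediator
% (e.g., a generated reasoning chain), a = answer.
P(a \mid \mathrm{do}(q)) = \sum_{r} P(r \mid q) \sum_{q'} P(a \mid q', r)\, P(q')
```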
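The GNN-free reformulation in the first Graph-R1 paper amounts to serializing a graph as text and asking the LLM to reason over it explicitly. The edge-list format and the reachability task below are assumptions chosen for illustration; the paper's actual prompt design and training recipe are not reproduced here.

```python
# Minimal sketch of the "graph task as textual reasoning" idea
# (illustrative serialization, not Graph-R1's actual prompt).
from typing import List, Tuple

def graph_to_prompt(edges: List[Tuple[int, int]], source: int, target: int) -> str:
    """Serialize an undirected edge list and pose a reachability question."""
    edge_lines = "\n".join(f"{u} -- {v}" for u, v in edges)
    return (
        "You are given an undirected graph as an edge list:\n"
        f"{edge_lines}\n"
        f"Reason step by step: is node {target} reachable from node {source}?"
    )

print(graph_to_prompt([(0, 1), (1, 2), (3, 4)], source=0, target=2))
```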

Sources

KG-o1: Enhancing Multi-hop Question Answering in Large Language Models via Knowledge Graph Integration

Self-Disguise Attack: Induce the LLM to disguise itself for AIGT detection evasion

From Confidence to Collapse in LLM Factual Robustness

If We May De-Presuppose: Robustly Verifying Claims through Presupposition-Free Question Decomposition

Unbiased Reasoning for Knowledge-Intensive Tasks in Large Language Models via Conditional Front-Door Adjustment

Graph-R1: Incentivizing the Zero-Shot Graph Learning Capability in LLMs via Explicit Reasoning

DS@GT at CheckThat! 2025: A Simple Retrieval-First, LLM-Backed Framework for Claim Normalization

SSFO: Self-Supervised Faithfulness Optimization for Retrieval-Augmented Generation

UniC-RAG: Universal Knowledge Corruption Attacks to Retrieval-Augmented Generation

EMMM, Explain Me My Model! Explainable Machine Generated Text Detection in Dialogues

Disabling Self-Correction in Retrieval-Augmented Generation via Stealthy Retriever Poisoning

Graph-R1: Unleashing LLM Reasoning with NP-Hard Graph Problems

Fact or Facsimile? Evaluating the Factual Robustness of Modern Retrievers

Enhancing Health Fact-Checking with LLM-Generated Synthetic Data
