Advances in Multimodal Fact-Checking and Hallucination Detection
The field of multimodal fact-checking and hallucination detection is evolving rapidly, driven by the need for more robust and accurate methods of verifying multimedia content. Recent work underscores two themes: claims must be evaluated against both visual and textual evidence jointly, and hallucinations in large vision-language models call for finer-grained analysis than coarse pass/fail judgments. Noteworthy papers include MM-FusionNet, which introduces a context-aware dynamic fusion module for multi-modal fake news detection, and SHALE, a scalable benchmark for fine-grained hallucination evaluation in large vision-language models. Together, these advances move the field toward more reliable and trustworthy multimodal fact-checking systems.
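The summary above does not spell out how MM-FusionNet's context-aware dynamic fusion works, but the general idea behind dynamic (gated) fusion can be sketched simply: a learned gate, conditioned on both modalities, decides per example how much weight the visual versus textual features receive. The sketch below is illustrative only; the weights, dimensions, and gating form are assumptions, not the paper's actual architecture.

```python
import numpy as np

# Illustrative sketch of context-aware dynamic fusion (NOT the actual
# MM-FusionNet architecture, which is not detailed in this summary).
# A scalar gate, computed from the concatenated modality features,
# blends image and text representations per example.

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def dynamic_fusion(img_feat, txt_feat, w_gate, b_gate):
    """Fuse two modality vectors with a context-dependent scalar gate.

    gate  = sigmoid(w_gate . [img; txt] + b_gate), in (0, 1)
    fused = gate * img_feat + (1 - gate) * txt_feat
    """
    context = np.concatenate([img_feat, txt_feat])
    gate = sigmoid(w_gate @ context + b_gate)  # scalar in (0, 1)
    fused = gate * img_feat + (1.0 - gate) * txt_feat
    return fused, gate

d = 8                                  # feature dimension (illustrative)
img = rng.standard_normal(d)           # stand-in image embedding
txt = rng.standard_normal(d)           # stand-in text embedding
w = rng.standard_normal(2 * d) * 0.1   # hypothetical gate weights
fused, gate = dynamic_fusion(img, txt, w, 0.0)
print(fused.shape, float(gate))        # fused keeps dimension d; gate in (0, 1)
```

Because the gate depends on the inputs, a claim whose image is uninformative can lean on the text, and vice versa, which is the intuition behind "dynamic" fusion as opposed to fixed-weight concatenation.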
Sources
MM-FusionNet: Context-Aware Dynamic Fusion for Multi-modal Fake News Detection with Large Vision-Language Models
Semi-automated Fact-checking in Portuguese: Corpora Enrichment using Retrieval with Claim extraction
ContextGuard-LVLM: Enhancing News Veracity through Fine-grained Cross-modal Contextual Consistency Verification
Bridging Semantic Logic Gaps: A Cognition-Inspired Multimodal Boundary-Preserving Network for Image Manipulation Localization
A Context-aware Attention and Graph Neural Network-based Multimodal Framework for Misogyny Detection
XFacta: Contemporary, Real-World Dataset and Evaluation for Multimodal Misinformation Detection with Multimodal LLMs
HiFACTMix: A Code-Mixed Benchmark and Graph-Aware Model for Evidence-Based Political Claim Verification in Hinglish