Advancements in Multimodal Safety and Security Research

Research on multimodal safety and security is evolving rapidly, with a growing focus on robust defenses against increasingly sophisticated attacks. Recent work emphasizes that threats must be considered across multiple modalities, including text, images, and audio, if they are to be mitigated effectively. One notable trend is the development of unified frameworks that handle diverse modalities and tasks, offering a more comprehensive approach to safety and security. There is also growing recognition of the need for transparent, interpretable models and of the limitations of language-specific guardrails.

Noteworthy papers in this area include DefenSee, which proposes a robust and lightweight multimodal defense technique, and OmniGuard, which introduces a unified framework for omni-modal guardrails with deliberate reasoning ability. Other notable works include Aetheria, which presents a multimodal interpretable content safety framework, and CREST, which develops a parameter-efficient multilingual safety classification model.

Sources

DefenSee: Dissecting Threat from Sight and Text - A Multi-View Defensive Pipeline for Multi-modal Jailbreaks

OmniGuard: Unified Omni-Modal Guardrails with Deliberate Reasoning

COGNITION: From Evaluation to Defense against Multimodal LLM CAPTCHA Solvers

VACoT: Rethinking Visual Data Augmentation with VLMs

Aetheria: A multimodal interpretable content safety framework based on multi-agent debate and collaboration

CREST: Universal Safety Guardrails Through Cluster-Guided Cross-Lingual Transfer

Contextual Image Attack: How Visual Context Exposes Multimodal Safety Vulnerabilities

Dynamic Optical Test for Bot Identification (DOT-BI): A simple check to identify bots in surveys and online processes

Counterfeit Answers: Adversarial Forgery against OCR-Free Document Visual Question Answering

Chameleon: Adaptive Adversarial Agents for Scaling-Based Visual Prompt Injection in Multimodal AI Systems
