Adversarial Robustness in Vision-Language Models

The field of vision-language models is advancing rapidly, with a growing focus on adversarial robustness. Recent research has shown that these models are vulnerable to adversarial attacks that can compromise their performance and reliability, and work is under way on both new attack frameworks and new defense strategies. Among the papers below, 'Enhancing Adversarial Robustness of Vision Language Models via Adversarial Mixture Prompt Tuning' presents a prompt-tuning defense aimed at generalizing across varied adversarial attacks, while 'Zero-Shot Vision Encoder Grafting via LLM Surrogates' proposes a strategy for reducing the cost of training vision-language models by leveraging small surrogate models.
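
The attacks discussed above typically perturb an input image within a small L-infinity budget so that the vision encoder's embedding no longer matches the clean image. As a rough illustration only (not the method of any paper listed here), the following PyTorch sketch runs a PGD-style attack against a placeholder vision encoder; the function name, toy encoder, and hyperparameters are assumptions made for this example.

```python
import torch
import torch.nn.functional as F


def pgd_attack_vision_encoder(encoder, image, eps=8 / 255, alpha=2 / 255, steps=10):
    """Craft an L-infinity-bounded perturbation that pushes the encoder's
    embedding of `image` away from its clean embedding (untargeted attack)."""
    clean_emb = encoder(image).detach()

    # Random start inside the epsilon ball, keeping pixel values in [0, 1].
    adv = (image + torch.empty_like(image).uniform_(-eps, eps)).clamp(0, 1).detach()

    for _ in range(steps):
        adv.requires_grad_(True)
        adv_emb = encoder(adv)
        # Gradient ascent on this loss lowers the cosine similarity, driving
        # the adversarial embedding away from the clean one.
        loss = -F.cosine_similarity(adv_emb, clean_emb, dim=-1).mean()
        grad = torch.autograd.grad(loss, adv)[0]
        with torch.no_grad():
            adv = adv + alpha * grad.sign()
            adv = image + (adv - image).clamp(-eps, eps)  # project back into the eps-ball
            adv = adv.clamp(0, 1)
    return adv.detach()


# Toy stand-in encoder purely for illustration; a real attack would target
# e.g. a CLIP-style vision tower inside a large vision-language model.
toy_encoder = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 224 * 224, 512))
image = torch.rand(1, 3, 224, 224)
adv_image = pgd_attack_vision_encoder(toy_encoder, image)
```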

Sources

VEAttack: Downstream-agnostic Vision Encoder Attack against Large Vision Language Models

Enhancing Adversarial Robustness of Vision Language Models via Adversarial Mixture Prompt Tuning

EVADE: Multimodal Benchmark for Evasive Content Detection in E-Commerce Applications

Benign-to-Toxic Jailbreaking: Inducing Harmful Responses from Harmless Prompts

Preventing Adversarial AI Attacks Against Autonomous Situational Awareness: A Maritime Case Study

Seeing the Threat: Vulnerabilities in Vision-Language Models to Adversarial Attack

Does Johnny Get the Message? Evaluating Cybersecurity Notifications for Everyday Users

Zero-Shot Vision Encoder Grafting via LLM Surrogates

Uncovering Visual-Semantic Psycholinguistic Properties from the Distributional Structure of Text Embedding Space

Disrupting Vision-Language Model-Driven Navigation Services via Adversarial Object Fusion

Adversarial Semantic and Label Perturbation Attack for Pedestrian Attribute Recognition

TRAP: Targeted Redirecting of Agentic Preferences

To Trust Or Not To Trust Your Vision-Language Model's Prediction
