Backdoor Attacks and Defenses in Machine Learning

Research on backdoor attacks and defenses in machine learning is surging. Recent studies show that deep neural networks are vulnerable to backdoor attacks, in which an adversary plants a crafted trigger in the training data so that any input carrying the trigger at inference time is misclassified toward an attacker-chosen target. In response, researchers have proposed a range of defenses to detect and mitigate these attacks. One notable direction is latent-driven backdoor attacks, which let attackers select arbitrary targets without retraining and evade conventional detection mechanisms. Another is defense frameworks that remove backdoor triggers without requiring access to the model. Noteworthy papers include BadBlocks, which proposes a backdoor threat that is more lightweight and covert than existing approaches; BDFirewall, which introduces a progressive defense framework that removes backdoor triggers from the most conspicuous to the most subtle; FLAT, which proposes a latent-driven arbitrary-target backdoor attack in federated learning; and Isolate Trigger, which introduces a precise and efficient detection and defense framework against evade-adaptive backdoor attacks.
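
To make the threat model concrete, the sketch below shows a classic patch-trigger poisoning setup in the spirit of BadNets-style attacks. It is not the method of any paper cited here; the function names, trigger shape, and poisoning rate are all illustrative assumptions.

```python
# Minimal, illustrative sketch of patch-trigger data poisoning.
# All names and parameters are assumptions for the example, not any
# specific paper's method.
import numpy as np

def add_patch_trigger(image, patch_size=3, value=1.0):
    """Stamp a small square trigger into the bottom-right corner.

    `image` is assumed to be a float array of shape (H, W, C) in [0, 1].
    """
    poisoned = image.copy()
    poisoned[-patch_size:, -patch_size:, :] = value
    return poisoned

def poison_dataset(images, labels, target_label, poison_rate=0.05, seed=0):
    """Inject the trigger into a random fraction of samples and relabel them.

    A model trained on the returned set behaves normally on clean inputs,
    but predicts `target_label` for any input stamped with the trigger.
    """
    rng = np.random.default_rng(seed)
    images = images.copy()
    labels = labels.copy()
    n_poison = int(poison_rate * len(images))
    idx = rng.choice(len(images), size=n_poison, replace=False)
    for i in idx:
        images[i] = add_patch_trigger(images[i])
        labels[i] = target_label
    return images, labels, idx
```

The papers surveyed above vary this basic recipe: making the trigger less conspicuous (BadBlocks), driving target selection through latent representations (FLAT), or removing such triggers from inputs without model access (BDFirewall).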

Sources

BadBlocks: Low-Cost and Stealthy Backdoor Attacks Tailored for Text-to-Image Diffusion Models

BDFirewall: Towards Effective and Expeditiously Black-Box Backdoor Defense in MLaaS

FLAT: Latent-Driven Arbitrary-Target Backdoor Attacks in Federated Learning

Isolate Trigger: Detecting and Eradicating Evade-Adaptive Backdoors

BadTime: An Effective Backdoor Attack on Multivariate Long-Term Time Series Forecasting

DocVCE: Diffusion-based Visual Counterfactual Explanations for Document Image Classification

NT-ML: Backdoor Defense via Non-target Label Training and Mutual Learning

From Detection to Correction: Backdoor-Resilient Face Recognition via Vision-Language Trigger Detection and Noise-Based Neutralization

Non-omniscient backdoor injection with a single poison sample: Proving the one-poison hypothesis for linear regression and linear classification
