Backdoor Attacks and Defenses in Machine Learning

Research on backdoor attacks and defenses in machine learning is surging. Recent studies show that deep neural networks are vulnerable to backdoor attacks, in which an adversary plants a crafted trigger in the training data so that any input carrying the trigger at inference time is misclassified toward an attacker-chosen target. In response, researchers have proposed a range of defenses to detect and mitigate these attacks. One notable direction is latent-driven backdoor attacks, which let attackers select arbitrary targets without retraining and evade conventional detection mechanisms. Another is defense frameworks that remove backdoor triggers without requiring access to the model. Noteworthy papers include BadBlocks, which proposes a backdoor threat that is more lightweight and covert than existing approaches; BDFirewall, which introduces a progressive defense framework that removes backdoor triggers from the most conspicuous to the most subtle; FLAT, which proposes a latent-driven arbitrary-target backdoor attack in federated learning; and Isolate Trigger, which introduces a precise and efficient detection and defense framework against evade-adaptive backdoor attacks.
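
To make the threat model concrete, the sketch below shows a classic patch-trigger poisoning setup in the spirit of BadNets-style attacks. It is not the method of any paper cited here; the function names, trigger shape, and poisoning rate are all illustrative assumptions.

```python
# Minimal, illustrative sketch of patch-trigger data poisoning.
# All names and parameters are assumptions for the example, not any
# specific paper's method.
import numpy as np

def add_patch_trigger(image, patch_size=3, value=1.0):
    """Stamp a small square trigger into the bottom-right corner.

    `image` is assumed to be a float array of shape (H, W, C) in [0, 1].
    """
    poisoned = image.copy()
    poisoned[-patch_size:, -patch_size:, :] = value
    return poisoned

def poison_dataset(images, labels, target_label, poison_rate=0.05, seed=0):
    """Inject the trigger into a random fraction of samples and relabel them.

    A model trained on the returned set behaves normally on clean inputs,
    but predicts `target_label` for any input stamped with the trigger.
    """
    rng = np.random.default_rng(seed)
    images = images.copy()
    labels = labels.copy()
    n_poison = int(poison_rate * len(images))
    idx = rng.choice(len(images), size=n_poison, replace=False)
    for i in idx:
        images[i] = add_patch_trigger(images[i])
        labels[i] = target_label
    return images, labels, idx
```

The papers surveyed above vary this basic recipe: making the trigger less conspicuous (BadBlocks), driving target selection through latent representations (FLAT), or removing such triggers from inputs without model access (BDFirewall).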

Sources

BadBlocks: Low-Cost and Stealthy Backdoor Attacks Tailored for Text-to-Image Diffusion Models

BDFirewall: Towards Effective and Expeditiously Black-Box Backdoor Defense in MLaaS

FLAT: Latent-Driven Arbitrary-Target Backdoor Attacks in Federated Learning

Isolate Trigger: Detecting and Eradicating Evade-Adaptive Backdoors

BadTime: An Effective Backdoor Attack on Multivariate Long-Term Time Series Forecasting

DocVCE: Diffusion-based Visual Counterfactual Explanations for Document Image Classification

NT-ML: Backdoor Defense via Non-target Label Training and Mutual Learning

From Detection to Correction: Backdoor-Resilient Face Recognition via Vision-Language Trigger Detection and Noise-Based Neutralization

Non-omniscient backdoor injection with a single poison sample: Proving the one-poison hypothesis for linear regression and linear classification
