Deepfake Detection and Localization Advances

Recent work on deepfake detection and localization concentrates on two hard problems: detecting partially manipulated speech and localizing forgeries in time under weak supervision. Researchers are exploring new perspectives, such as analyzing frame-level temporal differences and exploiting context-dependent duration features, to improve detection performance. Proposed frameworks and models, including those built on multimodal interaction mechanisms and extensible deviation perceiving losses, aim for more accurate and robust results. Notable papers include:

  • Frame-level Temporal Difference Learning for Partial Deepfake Speech Detection, which introduces a Temporal Difference Attention Module and reframes partial deepfake detection so that it does not rely on explicit boundary annotations.
  • A Multimodal Deviation Perceiving Framework for Weakly-Supervised Temporal Forgery Localization, which combines a multimodal interaction mechanism with an extensible deviation perceiving loss to capture deviations across modalities.
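
To make the idea of frame-level temporal differences concrete, here is a minimal sketch. It is not the Temporal Difference Attention Module from the paper above; it only illustrates the underlying intuition that an abrupt change between consecutive frame features can hint at a splice point. The function names, the toy feature vectors, and the median-based threshold (`ratio`) are all assumptions made for illustration.

```python
import math


def frame_differences(frames):
    """Return the L2 norm of the difference between each pair of consecutive frames."""
    diffs = []
    for prev, curr in zip(frames, frames[1:]):
        diffs.append(math.sqrt(sum((c - p) ** 2 for p, c in zip(prev, curr))))
    return diffs


def candidate_boundaries(frames, ratio=3.0):
    """Flag frame indices where the change from the previous frame is unusually large.

    A transition is flagged when its difference exceeds `ratio` times the
    median difference. This heuristic threshold is an assumption, not a
    method taken from any of the cited papers.
    """
    diffs = frame_differences(frames)
    if not diffs:
        return []
    median = sorted(diffs)[len(diffs) // 2]
    threshold = ratio * median
    # diffs[i] compares frame i with frame i + 1, so report index i + 1.
    return [i + 1 for i, d in enumerate(diffs) if d > threshold]


# Toy per-frame features (hypothetical): a smooth drift with one abrupt jump.
frames = [
    [0.0, 0.0],
    [0.1, 0.0],
    [0.2, 0.0],
    [2.2, 1.0],  # abrupt change relative to the previous frame
    [2.3, 1.0],
    [2.4, 1.0],
]

boundaries = candidate_boundaries(frames)
print(boundaries)
```

In this toy input the only large transition is between the third and fourth frames, so the fourth frame (index 3) is the one flagged. Real systems learn such cues from acoustic embeddings rather than thresholding raw differences.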

Sources

Frame-level Temporal Difference Learning for Partial Deepfake Speech Detection

Exploiting Context-dependent Duration Features for Voice Anonymization Attack Systems

LENS-DF: Deepfake Detection and Temporal Localization for Long-Form Noisy Speech

A Multimodal Deviation Perceiving Framework for Weakly-Supervised Temporal Forgery Localization
