Research on audio and face privacy protection currently centers on two goals: detecting and preventing deepfakes, and protecting sensitive information from unauthorized access. Recent work applies phoneme-level analysis to person-of-interest speech deepfake detection, reporting accuracy comparable to traditional approaches along with better robustness and interpretability. Multi-level strategies for deepfake content moderation have also been proposed; they combine the strengths of existing methods to improve scalability and practicality. Noteworthy papers include a forensic machine learning technique that detects deepfake video impersonations from unnatural patterns in facial biometrics, and Enkidu, a user-oriented privacy-preserving framework that uses universal frequential perturbations to defend against personalized voice deepfake threats.
Advances in Audio and Face Privacy Protection
Sources
Enforcing Speech Content Privacy in Environmental Sound Recordings using Segment-wise Waveform Reversal
Cross-Modal Watermarking for Authentic Audio Recovery and Tamper Localization in Synthesized Audiovisual Forgeries
Enkidu: Universal Frequential Perturbation for Real-Time Audio Privacy Protection against Voice Deepfakes