The field of audio security and privacy is rapidly evolving, with a focus on developing robust detection methods for audio deepfakes and improving the privacy guarantees of speaker de-identification systems. Recent research has highlighted the vulnerabilities of current audio deepfake detectors and the need for more realistic and challenging datasets. There is also growing interest in multimodal fact-checking, including benchmarks for evaluating fact-checking models on audio dialogues.
Noteworthy papers in this area include:
- Perturbed Public Voices (P$^{2}$V): introduces a new dataset for robust audio deepfake detection and demonstrates the vulnerabilities of current detectors.
- MAD: a benchmark for multi-turn audio dialogue fact-checking that captures the complexity of spoken misinformation and provides a challenging testbed for fact-checking models.
- Any-to-any Speaker Attribute Perturbation for Asynchronous Voice Anonymization: proposes a speaker anonymization approach that enhances identity unlinkability among anonymized utterances from the same original speaker.
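To make the unlinkability notion concrete: one common proxy is the pairwise similarity of speaker embeddings extracted from anonymized utterances of the same original speaker, with lower similarity suggesting the utterances are harder to link. The sketch below is illustrative only and is not the metric used in the cited paper; the function name and the 192-dimensional embedding size are assumptions chosen to mirror common x-vector setups.

```python
import numpy as np

def pairwise_cosine_similarity(embeddings):
    """Mean pairwise cosine similarity over a set of speaker embeddings.

    When the embeddings come from anonymized utterances of one original
    speaker, a lower mean similarity is an illustrative proxy for
    stronger identity unlinkability (not the paper's exact metric).
    """
    X = np.asarray(embeddings, dtype=float)
    X = X / np.linalg.norm(X, axis=1, keepdims=True)  # unit-normalize rows
    sims = X @ X.T                                    # full cosine similarity matrix
    iu = np.triu_indices(len(X), k=1)                 # unique off-diagonal pairs
    return float(sims[iu].mean())

# Toy example with three random "anonymized" embeddings (hypothetical data)
rng = np.random.default_rng(0)
emb = rng.normal(size=(3, 192))
score = pairwise_cosine_similarity(emb)
```

In practice the embeddings would come from a pretrained speaker verification model, and the score would be compared against the same statistic computed on un-anonymized utterances.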