The field of face security and deepfake detection is evolving rapidly, with a focus on more robust and generalizable methods for detecting and preventing facial manipulation. Recent research has explored synthetic data, incremental learning, and multimodal analysis to improve the accuracy and reliability of face security systems. New benchmarks and datasets, such as DREAM and ImmerIris, have also facilitated progress in this area. Notably, methods such as SyncLipMAE and PIA have demonstrated clear gains in deepfake detection and audio-visual synchronization.
Noteworthy papers include DREAM, which presents a comprehensive benchmark for deepfake realism assessment; SyncLipMAE, which introduces a self-supervised pretraining framework for talking-face video and achieves state-of-the-art results in audio-visual stream synchronization and facial emotion recognition; ImmerIris, which proposes a large-scale dataset and benchmark for immersive iris recognition in open scenes and reports promising results with a normalization-free paradigm; and PIA, which presents a multimodal audio-visual framework for deepfake detection that incorporates language, dynamic face motion, and facial identification cues.
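To make the audio-visual synchronization cue concrete, the minimal sketch below scores how well per-frame audio and lip-region embeddings line up over a small window of temporal offsets. It is an illustration of the general lip-sync idea that methods such as SyncLipMAE and PIA build on, not their published implementations: the embedding dimensions, the cosine-similarity scorer, and the offset search are assumptions made for this example.

```python
import numpy as np

def mean_cosine(a, b):
    """Average row-wise cosine similarity between two (T, D) embedding arrays."""
    a = a / (np.linalg.norm(a, axis=1, keepdims=True) + 1e-8)
    b = b / (np.linalg.norm(b, axis=1, keepdims=True) + 1e-8)
    return float((a * b).sum(axis=1).mean())

def sync_score(audio_emb, visual_emb, max_offset=5):
    """
    Slide the audio embeddings against the visual (lip-region) embeddings and
    return the best average cosine similarity and the offset where it occurs.

    audio_emb, visual_emb: (T, D) arrays of per-frame embeddings from some
    audio and visual encoders (assumed to be precomputed).
    A low best score, or a best offset far from zero, is a weak cue that the
    audio track and the lip motion may be mismatched.
    """
    T = min(len(audio_emb), len(visual_emb))
    scores = {}
    for off in range(-max_offset, max_offset + 1):
        if off >= 0:
            a, v = audio_emb[off:T], visual_emb[:T - off]
        else:
            a, v = audio_emb[:T + off], visual_emb[-off:T]
        if len(a) > 0:
            scores[off] = mean_cosine(a, v)
    best_offset = max(scores, key=scores.get)
    return scores[best_offset], best_offset

# Toy usage: random vectors stand in for real encoder outputs.
rng = np.random.default_rng(0)
audio = rng.normal(size=(100, 256))
visual = audio + 0.1 * rng.normal(size=(100, 256))  # roughly in sync
print(sync_score(audio, visual))
```

In practice the two encoders are trained (e.g., with contrastive or masked objectives) so that temporally aligned audio and lip embeddings score high, and the resulting sync score can then be combined with other cues, such as identity or motion features, in a downstream deepfake classifier.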