Advances in Data Privacy and Security for Machine Learning

The field of machine learning is placing greater emphasis on data privacy and security, with a focus on developing methods that prevent data leakage and preserve the integrity of trained models. Recent research has explored data cartography for identifying and mitigating memorization hotspots in generative models, as well as frameworks for analyzing and detecting data forging in machine unlearning. There is also growing interest in the learnability of distribution classes in the presence of adaptive adversaries and in the vulnerability of individual test samples to targeted data poisoning attacks. Noteworthy papers in this area include: Not All Samples Are Equal: Quantifying Instance-level Difficulty in Targeted Data Poisoning, which introduces predictive criteria for how difficult a given sample is to poison; The Measure of Deception: An Analysis of Data Forging in Machine Unlearning, which develops a framework for analyzing data forging; and Generative Data Refinement: Just Ask for Better Data, which proposes using pretrained generative models to transform datasets containing undesirable content into refined datasets.
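To make the first idea concrete, below is a minimal sketch of training-dynamics-based data cartography: record each training example's probability on its true label across epochs, then flag examples that the model eventually fits despite persistently low confidence as candidate memorization hotspots. The model, data loader, and thresholds here are illustrative assumptions, not the cited paper's method.

```python
# A minimal sketch, assuming a dataset-maps style of data cartography.
import numpy as np
import torch
import torch.nn.functional as F

def record_training_dynamics(model, loader, optimizer, epochs=5, device="cpu"):
    """Train for a few epochs; return an (epochs, n_examples) array of p(true label)."""
    dynamics = np.zeros((epochs, len(loader.dataset)))
    model.to(device)
    for epoch in range(epochs):
        model.train()
        for x, y, idx in loader:  # the loader must yield each example's dataset index
            x, y = x.to(device), y.to(device)
            logits = model(x)
            loss = F.cross_entropy(logits, y)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            with torch.no_grad():
                p_true = F.softmax(logits, dim=-1)[torch.arange(len(y), device=device), y]
                dynamics[epoch, idx.numpy()] = p_true.cpu().numpy()
    return dynamics

def flag_hotspots(dynamics, conf_thresh=0.5, fit_thresh=0.9):
    """Examples fitted by the final epoch despite low mean confidence are candidates."""
    confidence = dynamics.mean(axis=0)   # mean p(true label) over training
    variability = dynamics.std(axis=0)   # spread of p(true label), kept for inspection
    fitted = dynamics[-1] > fit_thresh   # confidently fitted by the last epoch
    return np.where((confidence < conf_thresh) & fitted)[0], confidence, variability
```

In practice, the flagged indices would feed a data intervention such as deduplication, filtering, or privacy-preserving training.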
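The core loop of generative data refinement can likewise be sketched, assuming a pretrained instruction-following model is available behind a simple text-in/text-out callable. The `generate` parameter and `REFINE_PROMPT` template are hypothetical placeholders, not an API from the paper.

```python
# A minimal sketch of refining records by asking a generative model to rewrite them.
from typing import Callable, Iterable

REFINE_PROMPT = (
    "Rewrite the following record so that it preserves its meaning and utility "
    "but contains no personal or otherwise sensitive information:\n\n{record}"
)

def refine_dataset(records: Iterable[str], generate: Callable[[str], str]) -> list[str]:
    """Rewrite each record with the supplied model, keeping utility and dropping PII."""
    return [generate(REFINE_PROMPT.format(record=record)) for record in records]
```

Any model client can be dropped in as `generate`; the refined records can then stand in for the originals during training.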

Sources

Data Cartography for Detecting Memorization Hotspots and Guiding Data Interventions in Generative Models

On the Learnability of Distribution Classes with Adaptive Adversaries

The Measure of Deception: An Analysis of Data Forging in Machine Unlearning

Not All Samples Are Equal: Quantifying Instance-level Difficulty in Targeted Data Poisoning

How Far Are We from True Unlearnability?

Generative Data Refinement: Just Ask for Better Data
