Advances in Secure Data Sharing and Watermarking

The field of secure data sharing and watermarking is evolving rapidly, with a focus on protecting sensitive information from unauthorized access and misuse. Recent work highlights two complementary goals: crafting unlearnable examples that prevent datasets from being exploited for unauthorized training, and designing watermarking schemes robust to a range of attacks. One notable direction is asynchronous event error-minimizing noise, which perturbs event streams so that models trained on them learn nothing useful, thereby safeguarding event datasets. Another is the use of generative AI to improve classifier performance in security tasks, for example by augmenting training datasets with synthetic data produced by GenAI methods.

Noteworthy papers in this area include the following.

Disappearing Ink: Obfuscation Breaks N-gram Code Watermarks in Theory and Practice formally models code obfuscation and proves that N-gram-based code watermarks cannot remain robust against it.

Asynchronous Event Error-Minimizing Noise for Safeguarding Event Dataset proposes a novel unlearnable event-stream generation method that blocks unauthorized training on event datasets.

Taming Data Challenges in ML-based Security Tasks: Lessons from Integrating Generative AI evaluates GenAI techniques for improving classifier performance in security tasks.

Enhancing LLM Watermark Resilience Against Both Scrubbing and Spoofing Attacks introduces a novel mechanism that breaks the trade-off, tied to watermark window size, between resistance to scrubbing and resistance to spoofing attacks.

Mitigating Watermark Stealing Attacks in Generative Models via Multi-Key Watermarking proposes a multi-key extension that mitigates watermark-stealing attacks in generative models.
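The unlearnable-examples idea behind error-minimizing noise can be illustrated with a deliberately simplified sketch. Everything below is a toy assumption, not the papers' method: real error-minimizing noise is optimized via bi-level optimization (and the event-stream variant operates on asynchronous events), whereas this sketch uses a fixed class-wise "shortcut" perturbation. The effect is the same in spirit: the perturbed labels become trivially predictable from the noise, so a model trained on the protected data learns the shortcut instead of the true signal and fails on clean data.

```python
import math
import random

random.seed(0)

def make_data(m):
    # 2-D toy data: the label depends only on the sign of feature x0
    data = []
    for _ in range(m):
        x = [random.gauss(0, 1), random.gauss(0, 1)]
        y = 1 if x[0] + random.gauss(0, 0.3) > 0 else 0
        data.append((x, y))
    return data

def poison(data, eps=5.0):
    # class-wise shortcut noise: shift feature x1 by +/- eps depending on label
    return [([x[0], x[1] + (eps if y == 1 else -eps)], y) for x, y in data]

def train(data, lr=0.1, epochs=50):
    # plain logistic regression trained by stochastic gradient descent
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in data:
            z = max(-30.0, min(30.0, w[0] * x[0] + w[1] * x[1] + b))
            g = 1.0 / (1.0 + math.exp(-z)) - y  # gradient of the loss w.r.t. z
            w[0] -= lr * g * x[0]
            w[1] -= lr * g * x[1]
            b -= lr * g
    return w, b

def accuracy(model, data):
    w, b = model
    return sum((w[0] * x[0] + w[1] * x[1] + b > 0) == (y == 1)
               for x, y in data) / len(data)

train_set, test_set = make_data(400), make_data(400)
clean_model = train(train_set)
poisoned_model = train(poison(train_set))

print(accuracy(clean_model, test_set))     # high: the true signal was learned
print(accuracy(poisoned_model, test_set))  # near chance: only the shortcut was learned
```

Because the shortcut feature is far easier to fit than the real signal, gradient descent saturates on it almost immediately, which is why the poisoned model transfers so poorly to clean data.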
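To see why obfuscation defeats N-gram code watermarks, consider a minimal green-list sketch (the key, token names, and hash construction here are illustrative assumptions, not the papers' actual schemes). The detector scores each token against a keyed hash of its preceding (n-1)-gram, so any semantics-preserving renaming rewrites those contexts and drives the score back toward chance:

```python
import hashlib

GAMMA = 0.5  # fraction of tokens that count as "green" for any given context

def is_green(context, token, key):
    # keyed pseudo-random green/red split of (context, token) pairs
    h = hashlib.sha256(f"{key}|{'|'.join(context)}|{token}".encode()).digest()
    return h[0] < 256 * GAMMA

def embed(candidates_per_step, key, n=2):
    # greedy embedder: at each step, emit a green candidate when one exists
    out = []
    for cands in candidates_per_step:
        ctx = out[-(n - 1):] if n > 1 else []
        green = [c for c in cands if is_green(ctx, c, key)]
        out.append(green[0] if green else cands[0])
    return out

def green_fraction(tokens, key, n=2):
    # detector: fraction of tokens that are green given their (n-1)-gram context
    hits = [is_green(tokens[i - n + 1:i], tokens[i], key)
            for i in range(n - 1, len(tokens))]
    return sum(hits) / len(hits)

# Toy "code generation": 20 steps, 6 interchangeable identifier choices each.
steps = [[f"id{i}_{j}" for j in range(6)] for i in range(20)]
key = "provider-key"  # hypothetical secret detection key
tokens = embed(steps, key)

# Semantics-preserving obfuscation: consistently rename every identifier.
renamed = [t + "_r" for t in tokens]

print(green_fraction(tokens, key))   # well above GAMMA: watermark detected
print(green_fraction(renamed, key))  # back near GAMMA: watermark erased
```

The renaming changes both the scored tokens and the N-gram contexts they are hashed with, so the keyed green/red split is effectively re-randomized, which is the intuition behind the impossibility result for N-gram-based robustness.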

Sources

Disappearing Ink: Obfuscation Breaks N-gram Code Watermarks in Theory and Practice

Asynchronous Event Error-Minimizing Noise for Safeguarding Event Dataset

Generalized and Unified Equivalences between Hardness and Pseudoentropy

Taming Data Challenges in ML-based Security Tasks: Lessons from Integrating Generative AI

Enhancing LLM Watermark Resilience Against Both Scrubbing and Spoofing Attacks

Semi-fragile watermarking of remote sensing images using DWT, vector quantization and automatic tiling

Shuffling for Semantic Secrecy

Temporal Unlearnable Examples: Preventing Personal Video Data from Unauthorized Exploitation by Object Tracking

Mitigating Watermark Stealing Attacks in Generative Models via Multi-Key Watermarking
