Privacy Risks in Deep Learning and Large Language Models

The field of deep learning and large language models is shifting toward explicitly acknowledging and addressing the privacy risks these technologies introduce. Recent research has highlighted privacy leakage in contrastive learning frameworks and the propagation of biases in synthetic tabular data generation with large language models. Studies have also shown that large language models can be vulnerable to property inference attacks, which reveal confidential properties of the training data. To measure and mitigate these risks, researchers are developing both stronger attacks, such as new membership inference methods, and novel defenses, such as selective data obfuscation. Noteworthy papers in this area include:

  • A study that proposes a membership inference attack based on the p-norm of feature vectors, outperforming existing methods in attack performance and robustness (a minimal sketch appears after this list).
  • A paper that introduces an adversarial scenario in which a malicious contributor injects bias into the synthetic dataset through a subset of in-context examples, compromising the fairness of downstream classifiers.
  • A work that proposes a benchmark task for evaluating property inference in large language models and introduces two tailored attacks, showing that they can reveal confidential properties of the training data.
  • A paper that proposes SOFT, a defense technique that mitigates privacy leakage by selecting influential data and using an adjustable parameter to balance utility preservation and privacy protection (see the second sketch below).
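The p-norm attack mentioned above can be illustrated in a few lines. The sketch below is a minimal, hedged example: it assumes a contrastive encoder that maps an input to a feature vector, a shadow set of known members and non-members for calibrating a threshold, and p = 2; the encoder interface, threshold rule, and the direction of the norm gap are assumptions for illustration, not details from the paper.

```python
# Minimal sketch of a p-norm-based membership inference attack.
# Assumptions (not from the source): the encoder interface, calibration on
# shadow members/non-members, and the "members have larger norms" hypothesis.
import numpy as np

def pnorm_scores(encoder, samples, p=2):
    """Score each sample by the p-norm of its feature vector."""
    feats = np.stack([encoder(x) for x in samples])   # (N, d) feature vectors
    return np.linalg.norm(feats, ord=p, axis=1)       # one scalar score per sample

def calibrate_threshold(member_scores, nonmember_scores):
    """Pick the threshold that best separates shadow members from non-members."""
    candidates = np.concatenate([member_scores, nonmember_scores])
    best_t, best_acc = None, 0.0
    for t in np.unique(candidates):
        # Balanced accuracy of the rule "score >= t means member".
        acc = 0.5 * ((member_scores >= t).mean() + (nonmember_scores < t).mean())
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

def infer_membership(encoder, queries, threshold, p=2):
    """Predict True (member) when the feature-norm score exceeds the threshold."""
    return pnorm_scores(encoder, queries, p) >= threshold
```

In practice the threshold would be calibrated on shadow data whose membership is known, then applied to query samples against the target encoder.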
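The selective-obfuscation idea behind SOFT can likewise be sketched as a two-step pipeline: score each fine-tuning example by how influential it is, then obfuscate only the most influential fraction, with that fraction acting as the adjustable utility-privacy knob. The gradient-norm scoring and the paraphrase_fn step below are illustrative stand-ins, not the paper's exact components.

```python
# Minimal sketch of selective data obfuscation in the spirit of SOFT.
# Assumptions (not from the source): gradient-norm influence scoring,
# paraphrase_fn as the obfuscation step, and obfuscation_ratio as the knob.
import torch

def influence_scores(model, loss_fn, dataset):
    """Proxy influence: per-example gradient norm of the fine-tuning loss."""
    scores = []
    for x, y in dataset:
        model.zero_grad()
        loss_fn(model(x), y).backward()
        grad_norm = torch.sqrt(sum((p.grad ** 2).sum()
                                   for p in model.parameters() if p.grad is not None))
        scores.append(grad_norm.item())
    return scores

def selectively_obfuscate(dataset, scores, paraphrase_fn, obfuscation_ratio=0.2):
    """Obfuscate the most influential fraction of examples.

    obfuscation_ratio is the adjustable parameter: higher values trade more
    utility for stronger protection against membership inference.
    """
    k = int(len(dataset) * obfuscation_ratio)
    ranked = sorted(range(len(dataset)), key=lambda i: scores[i], reverse=True)
    to_obfuscate = set(ranked[:k])
    return [(paraphrase_fn(x), y) if i in to_obfuscate else (x, y)
            for i, (x, y) in enumerate(dataset)]
```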

Sources

When Better Features Mean Greater Risks: The Performance-Privacy Trade-Off in Contrastive Learning

In-Context Bias Propagation in LLM-Based Tabular Data Generation

Can We Infer Confidential Properties of Training Data from LLMs?

SOFT: Selective Data Obfuscation for Protecting LLM Fine-tuning against Membership Inference Attacks
