Enhancing Security and Privacy in Large Language Models

Security and privacy in large language models (LLMs) are advancing rapidly, with a growing focus on mitigating risks and meeting evolving security requirements. Researchers are exploring approaches to enforce policy compliance, detect and prevent membership inference attacks, and build secure frameworks for LLM interactions. Notably, capability-based sandboxes, permissioned LLMs, and methods for detecting stealthy backdoor samples are driving this progress. These efforts aim to balance usability, control, and security, ultimately supporting the adoption of LLMs in sensitive domains. Noteworthy papers include:

  • LLM Access Shield, which proposes a security framework for policy compliance and sensitive data anonymization (a toy redaction sketch follows this list).
  • Permissioned LLMs, which introduces a class of LLMs that enforce access control structures on query responses (see the access-control sketch below).
  • Detecting Stealthy Backdoor Samples, which presents a method based on intra-class distance for detecting poisoned samples in LLMs (see the outlier-flagging sketch below).
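
To make the anonymization idea concrete, here is a minimal sketch of redacting sensitive spans from a prompt before it is sent to an LLM. The regex patterns, placeholder scheme, and the `anonymize` helper are illustrative assumptions, not the mechanism described in the LLM Access Shield paper.

```python
import re

# Hypothetical PII patterns for the sketch; a real policy engine would be far richer.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def anonymize(prompt: str) -> tuple[str, dict[str, str]]:
    """Replace matched PII spans with placeholders; return a mapping for later restoration."""
    mapping: dict[str, str] = {}
    for label, pattern in PII_PATTERNS.items():
        for i, match in enumerate(pattern.findall(prompt)):
            placeholder = f"<{label}_{i}>"
            mapping[placeholder] = match
            prompt = prompt.replace(match, placeholder)
    return prompt, mapping

redacted, mapping = anonymize("Contact jane.doe@example.com or 555-123-4567.")
print(redacted)  # Contact <EMAIL_0> or <PHONE_0>.
```

The mapping returned alongside the redacted prompt allows the original values to be restored in the model's response if policy permits.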
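For the access-control idea, the sketch below filters retrieved documents against a caller's roles before they reach the model's context. The data model and role labels are assumptions for illustration, not the design proposed in the Permissioned LLMs paper.

```python
from dataclasses import dataclass

@dataclass
class Document:
    text: str
    required_roles: frozenset[str]  # roles a caller must hold to see this document

@dataclass
class User:
    name: str
    roles: frozenset[str]

def authorized_context(user: User, documents: list[Document]) -> list[str]:
    """Return only the documents whose required roles the user holds."""
    return [d.text for d in documents if d.required_roles <= user.roles]

docs = [
    Document("Public product FAQ.", frozenset()),
    Document("Quarterly financials (finance only).", frozenset({"finance"})),
]
analyst = User("alice", frozenset({"finance"}))
intern = User("bob", frozenset())

print(authorized_context(analyst, docs))  # both documents
print(authorized_context(intern, docs))   # only the public FAQ
```

Enforcing the check before context assembly, rather than on the generated text, keeps restricted content out of the model's working memory entirely.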
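Finally, a minimal sketch of intra-class-distance screening: within each label, samples whose embeddings lie unusually far from the class centroid are flagged as suspicious. The z-score threshold and the embedding source are assumptions, not the paper's exact procedure.

```python
import numpy as np

def flag_outliers(embeddings: np.ndarray, labels: np.ndarray, z_thresh: float = 3.0) -> np.ndarray:
    """Flag samples far from their class centroid (hypothetical threshold rule)."""
    suspicious = np.zeros(len(labels), dtype=bool)
    for cls in np.unique(labels):
        idx = np.where(labels == cls)[0]
        centroid = embeddings[idx].mean(axis=0)
        dists = np.linalg.norm(embeddings[idx] - centroid, axis=1)
        z = (dists - dists.mean()) / (dists.std() + 1e-8)
        suspicious[idx[z > z_thresh]] = True
    return suspicious

rng = np.random.default_rng(0)
emb = rng.normal(size=(200, 16))
labels = np.repeat([0, 1], 100)
emb[0] += 10.0  # simulate one sample far from its class centroid
print(np.where(flag_outliers(emb, labels))[0])  # likely flags index 0
```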

Sources

LLM Access Shield: Domain-Specific LLM Framework for Privacy Policy Compliance

Safeguarding Privacy of Retrieval Data against Membership Inference Attacks: Is This Query Too Close to Home?

Operationalizing CaMeL: Strengthening LLM Defenses for Enterprise Deployment

Permissioned LLMs: Enforcing Access Control in Large Language Models

Detecting Stealthy Backdoor Samples based on Intra-class Distance for Large Language Models

Bayesian Perspective on Memorization and Reconstruction
