The field of large language models (LLMs) is seeing rapid progress in security and privacy, with growing attention to mitigating risks and meeting evolving security requirements. Researchers are exploring approaches to enforce policy compliance, to detect and prevent membership inference attacks, and to build secure frameworks for LLM interactions. In particular, capability-based sandboxes, permissioned LLMs, and methods for detecting stealthy backdoor samples are moving the field forward. These efforts aim to balance usability, control, and security, supporting the adoption of LLMs in sensitive domains. Noteworthy papers include:
- LLM Access Shield, which proposes a security framework for policy compliance and sensitive data anonymization (a minimal redaction sketch follows this list).
- Permissioned LLMs, which introduces a novel class of LLMs that enforce access control structures on query responses (see the access-gated retrieval sketch below).
- Detecting Stealthy Backdoor Samples, which presents a method based on intra-class distance for detecting poisoned training samples for LLMs (see the outlier-scoring sketch below).
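
To make the anonymization idea concrete, here is a minimal sketch of the kind of pre-prompt redaction layer a framework like LLM Access Shield describes. The regex patterns, placeholder format, and function names are illustrative assumptions, not the paper's implementation.

```python
import re

# Hypothetical patterns; a real deployment would use a vetted PII detector.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def anonymize(prompt: str) -> tuple[str, dict[str, str]]:
    """Replace sensitive spans with placeholders before the prompt leaves the org."""
    mapping: dict[str, str] = {}
    for label, pattern in PII_PATTERNS.items():
        # Deduplicate matches so each value gets exactly one placeholder.
        for i, value in enumerate(dict.fromkeys(pattern.findall(prompt))):
            placeholder = f"<{label}_{i}>"
            mapping[placeholder] = value
            prompt = prompt.replace(value, placeholder)
    return prompt, mapping

def deanonymize(response: str, mapping: dict[str, str]) -> str:
    """Restore original values in the model's response, if placeholders appear."""
    for placeholder, value in mapping.items():
        response = response.replace(placeholder, value)
    return response

safe_prompt, mapping = anonymize("Email alice@example.com, SSN 123-45-6789.")
# safe_prompt is sent to the external LLM; the mapping never leaves the boundary.
```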
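Similarly, the access-control idea behind Permissioned LLMs can be sketched as a gate that filters retrieved context by the caller's roles before it reaches the prompt. The `Chunk`/`PermissionedRetriever` names, role tags, and substring matching below are assumptions for illustration, not the paper's mechanism.

```python
from dataclasses import dataclass, field

@dataclass
class Chunk:
    """A retrievable knowledge snippet tagged with the roles allowed to see it."""
    text: str
    allowed_roles: frozenset[str]

@dataclass
class PermissionedRetriever:
    """Hypothetical gate: only chunks the caller is entitled to reach the LLM prompt."""
    corpus: list[Chunk] = field(default_factory=list)

    def retrieve(self, query: str, caller_roles: set[str]) -> list[str]:
        # Keep only chunks whose ACL intersects the caller's roles.
        visible = [c for c in self.corpus if c.allowed_roles & caller_roles]
        # A real system would rank by relevance; here we just substring-match.
        return [c.text for c in visible if query.lower() in c.text.lower()]

retriever = PermissionedRetriever([
    Chunk("Salary bands for 2025 ...", frozenset({"hr"})),
    Chunk("Public holiday calendar ...", frozenset({"hr", "employee"})),
])
context = retriever.retrieve("holiday", caller_roles={"employee"})
# Only the holiday chunk is eligible to be placed in the LLM prompt.
```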
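Finally, a rough outline of an intra-class-distance check in the spirit of the backdoor-detection paper: score each training sample by its distance to its class centroid in embedding space and flag outliers. The z-score threshold and the choice of embedding layer are assumptions; the paper's actual scoring rule may differ.

```python
import numpy as np

def flag_suspicious(embeddings: np.ndarray, labels: np.ndarray,
                    z_thresh: float = 3.0) -> np.ndarray:
    """Return a boolean mask of samples whose intra-class distance is anomalous.

    Generic heuristic: poisoned samples forced onto a target label often sit
    far from that label's clean centroid in embedding space.
    """
    flags = np.zeros(len(labels), dtype=bool)
    for cls in np.unique(labels):
        idx = np.where(labels == cls)[0]
        centroid = embeddings[idx].mean(axis=0)
        dists = np.linalg.norm(embeddings[idx] - centroid, axis=1)
        z = (dists - dists.mean()) / (dists.std() + 1e-8)
        flags[idx] = z > z_thresh
    return flags

# Usage: embeddings from the model's penultimate layer, labels from the training
# set; review or drop the flagged samples before fine-tuning.
```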