Advances in LLM Safety and Compliance

The field of Large Language Models (LLMs) is rapidly evolving, with a growing focus on safety and compliance. Recent research highlights the need for rigorous, systematic methods for ensuring LLM safety, particularly in applications where LLMs generate policy briefs or provide legal advice. One key direction is the development of benchmarks and evaluation metrics that assess the safety and compliance of LLMs; notably, researchers are drawing on legal frameworks and compliance standards to define and measure safety compliance. Another area of focus is runtime verification frameworks that provide continuous, quantitative assurance of LLM safety. Overall, the field is moving toward a more comprehensive and systematic approach to LLM safety and compliance, combining guardrail-style checks on inputs and outputs with compliance-oriented evaluation, as sketched below.

Noteworthy papers in this area include Sci2Pol, which proposes a benchmark and training dataset for evaluating and fine-tuning LLMs on policy brief generation; Safety Compliance, which develops a new benchmark for safety compliance and aligns LLMs with legal standards to mitigate safety risks; GSPR, which proposes a Generalizable Safety Policy Reasoner that identifies unsafe input prompts and LLM outputs along with the safety taxonomy categories they violate; and AIReg-Bench, which introduces a benchmark dataset for testing how well LLMs assess compliance with AI regulations.
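To make the guardrail idea concrete, the following is a minimal, illustrative sketch of a runtime check that screens both an input prompt and a model's output against a small safety taxonomy. The taxonomy labels, keyword rules, and helper names here are invented for illustration; they are not the learned reasoner from GSPR nor the verification logic of AgentGuard, only a stand-in showing where such checks sit around a generation call.

```python
# Illustrative sketch only: a hypothetical keyword-based guardrail wrapped
# around an arbitrary text-generation callable. Real systems would replace
# `check` with a learned safety policy reasoner or a formal runtime monitor.
from dataclasses import dataclass

# Hypothetical taxonomy: category name -> trigger keywords (invented).
SAFETY_TAXONOMY = {
    "unlicensed_legal_advice": ["you should sue", "file this motion"],
    "policy_misrepresentation": ["the law requires", "it is illegal to"],
}

@dataclass
class Verdict:
    allowed: bool
    violated_categories: list

def check(text: str) -> Verdict:
    """Flag text that matches any category in the taxonomy."""
    lowered = text.lower()
    violations = [
        category
        for category, keywords in SAFETY_TAXONOMY.items()
        if any(kw in lowered for kw in keywords)
    ]
    return Verdict(allowed=not violations, violated_categories=violations)

def guarded_generate(prompt: str, generate) -> str:
    """Run safety checks on the prompt before, and the output after, generation."""
    pre = check(prompt)
    if not pre.allowed:
        return f"Refused: prompt violates {pre.violated_categories}"
    output = generate(prompt)
    post = check(output)
    if not post.allowed:
        return f"Withheld: output violates {post.violated_categories}"
    return output

if __name__ == "__main__":
    # Stand-in for a real LLM call.
    fake_llm = lambda p: "The law requires you to respond within 30 days."
    print(guarded_generate("Summarize this policy brief.", fake_llm))
```

In this sketch the same check runs on both sides of the generation call, mirroring the paragraph's distinction between screening unsafe input prompts and screening model outputs; a compliance-oriented system would additionally tie each taxonomy category back to the legal standard it operationalizes.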

Sources

Sci2Pol: Evaluating and Fine-tuning LLMs on Scientific-to-Policy Brief Generation

Safety Compliance: Rethinking LLM Safety Reasoning through the Lens of Compliance

AgentGuard: Runtime Verification of AI Agents

GSPR: Aligning LLM Safeguards as Generalizable Safety Policy Reasoners

MASLegalBench: Benchmarking Multi-Agent Systems in Deductive Legal Reasoning

AIReg-Bench: Benchmarking Language Models That Assess AI Regulation Compliance
