Explainable Threat Intelligence and Security in Large Language Models

Research on large language models is moving toward greater explainability and security, with a focus on systems whose outputs are transparent and interpretable. Techniques such as knowledge graphs and retrieval-augmented generation help models produce safe, accurate, and attributable results. Noteworthy papers in this area include:

- Large Language Models for Explainable Threat Intelligence, which pairs a large language model with retrieval-augmented generation to gather threat intelligence and produce explainable, evidence-grounded answers.
- KG-DF: A Black-box Defense Framework against Jailbreak Attacks Based on Knowledge Graphs, which introduces a knowledge-graph-based defense framework that strengthens resistance to jailbreak attacks while improving response quality in general QA scenarios.
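The retrieval-augmented approach to explainable threat intelligence can be sketched as follows. This is a minimal illustration under assumed design choices (keyword-overlap retrieval, a citation-demanding prompt template), not the paper's actual system; all function names and the sample corpus are hypothetical.

```python
# Hypothetical RAG sketch for explainable threat intelligence.
# The retriever and prompt template are illustrative assumptions.

def retrieve(query, corpus, k=1):
    """Rank corpus entries by naive keyword overlap with the query."""
    q_terms = set(query.lower().split())
    scored = sorted(
        corpus,
        key=lambda doc: len(q_terms & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query, evidence):
    """Assemble a prompt that forces the model to cite retrieved evidence,
    which is what makes the eventual answer explainable."""
    context = "\n".join(f"- {doc}" for doc in evidence)
    return (
        "Answer the threat-intelligence question using ONLY the evidence "
        "below, and cite which evidence lines support each claim.\n"
        f"Evidence:\n{context}\nQuestion: {query}"
    )

corpus = [
    "CVE-2021-44228 (Log4Shell) allows remote code execution via JNDI lookups.",
    "Phishing campaigns often spoof login pages to harvest credentials.",
]
query = "What is the Log4Shell remote code execution flaw?"
evidence = retrieve(query, corpus)
prompt = build_prompt(query, evidence)
# `prompt` would then be sent to an LLM; because the answer is grounded in
# retrieved evidence, each claim can be traced back to a source document.
```

Grounding the model's answer in explicitly retrieved documents is what turns an opaque completion into an attributable, auditable one.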
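A knowledge-graph defense against jailbreaks can likewise be sketched at a high level. The graph contents, matching rule, and function names below are illustrative assumptions, not KG-DF's actual algorithm: the idea shown is only that a prompt is screened against a graph of restricted concepts and their neighbors before it ever reaches the black-box model.

```python
# Illustrative knowledge-graph screening sketch (assumed design, not KG-DF's
# published method). A prompt touching a restricted node or its neighbors is
# refused before being forwarded to the model.

HARM_GRAPH = {
    # node: set of neighboring (related) concept nodes
    "explosives": {"detonator", "synthesis"},
    "malware": {"ransomware", "keylogger"},
}

def graph_flags(prompt):
    """Return restricted nodes whose name or neighbors appear in the prompt."""
    terms = set(prompt.lower().split())
    flagged = []
    for node, neighbors in HARM_GRAPH.items():
        if node in terms or terms & neighbors:
            flagged.append(node)
    return flagged

def guard(prompt):
    """Refuse flagged prompts; pass benign ones through to the model."""
    flagged = graph_flags(prompt)
    if flagged:
        return f"Refused: prompt touches restricted concepts {sorted(flagged)}"
    return "OK to forward to the model"
```

Because the screening happens outside the model, the defense is black-box: it needs no access to model weights or internals, and benign queries pass through unchanged, which is how a graph-based filter can avoid degrading general QA quality.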

Sources

Large Language Models for Explainable Threat Intelligence

KG-DF: A Black-box Defense Framework against Jailbreak Attacks Based on Knowledge Graphs

Publish Your Threat Models! The benefits far outweigh the dangers

Knowledge Graph Analysis of Legal Understanding and Violations in LLMs
