Advances in Legal Knowledge Retrieval and Modeling

The field of legal knowledge retrieval and modeling is moving toward large language models (LLMs) and retrieval-augmented generation (RAG) systems to improve performance and robustness. Researchers are applying LLMs to a variety of legal tasks, including legal coding, identifying legal holdings, and legal question answering. A key challenge in this area is the lack of realistic benchmarks that capture the complexity of both legal retrieval and downstream legal question answering; to address this, new legal RAG benchmarks such as Bar Exam QA and Housing Statute QA are being introduced. Another important direction is bringing legal knowledge to the public, including the construction of legal question banks and interactive recommenders (a minimal sketch of the question-bank idea follows the list below). Noteworthy papers in this area include:

  • NbBench, which introduces a comprehensive benchmark suite for nanobody representation learning;
  • QBR, which proposes a question-bank-based approach to fine-grained legal knowledge retrieval for the general public; and
  • Identifying Legal Holdings with LLMs, which presents a systematic study of the performance of modern LLMs on a legal benchmark dataset.
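
As a concrete illustration of the question-bank idea behind QBR, the sketch below matches a lay query against a small bank of pre-written questions, each linked to the legal knowledge it covers, and then assembles a RAG-style prompt from the match. The bank entries, the bag-of-words scorer, and all names here are illustrative assumptions, not QBR's actual pipeline.

```python
import math
import re
from collections import Counter

# Toy question bank: each pre-written question is linked to the legal
# knowledge it covers. Entries are invented for illustration.
QUESTION_BANK = [
    ("Can my landlord evict me without giving notice?",
     "Most housing statutes require written notice before an eviction."),
    ("Is quoting a short passage from a book in a review fair use?",
     "Fair use turns on purpose, amount used, and effect on the market."),
]

def bag_of_words(text):
    """Lowercased word counts; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm = math.sqrt(sum(v * v for v in a.values()))
    norm *= math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query):
    """Return the bank entry whose question best matches the query."""
    q = bag_of_words(query)
    return max(QUESTION_BANK, key=lambda entry: cosine(q, bag_of_words(entry[0])))

query = "my landlord is evicting me with no notice"
question, knowledge = retrieve(query)

# RAG-style prompt assembly: the retrieved knowledge becomes the context
# an LLM would answer from.
prompt = (
    "Answer the question using only the context below.\n"
    f"Context: {knowledge}\n"
    f"Question: {query}"
)
print(prompt)
```

In a real system, the overlap scorer would be replaced by dense retrieval over statutes or case passages, and the prompt would be sent to an LLM; evaluating that end-to-end setup is what the legal RAG benchmarks above are designed for.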

Sources

Explainability by design: an experimental analysis of the legal coding process

NbBench: Benchmarking Language Models for Comprehensive Nanobody Tasks

Identifying Legal Holdings with LLMs: A Systematic Study of Performance, Scale, and Memorization

Incorporating Legal Structure in Retrieval-Augmented Generation: A Case Study on Copyright Fair Use

Bye-bye, Bluebook? Automating Legal Procedure with Large Language Models

A Reasoning-Focused Legal Retrieval Benchmark

Bringing legal knowledge to the public by constructing a legal question bank using large-scale pre-trained language model

QBR: A Question-Bank-Based Approach to Fine-Grained Legal Knowledge Retrieval for the General Public
