Statistical Guarantees in AI Model Training and Evaluation

Recent AI research is converging on methods that attach statistical guarantees to model training and evaluation: identifying training data with provable false discovery rate (FDR) control, constructing prediction sets with bounded miscoverage rates, and selecting the instances on which a model's predictions can be trusted. These advances aim to make AI models trustworthy and reliable, particularly in risk-sensitive applications. Notable papers in this area include:

  • High-Power Training Data Identification with Provable Statistical Guarantees, which introduces a rigorous method for identifying training data with strict false discovery rate control.
  • SAFER: Risk-Constrained Sample-then-Filter in Large Language Models, which presents a two-stage risk control framework for ensuring the trustworthiness of AI model outputs.
  • Selective Labeling with False Discovery Rate Control, which proposes a method to identify instances where AI predictions can be provably trusted by controlling the false discovery rate.
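A common building block behind guarantees like these is the Benjamini-Hochberg step-up procedure, which bounds the expected false discovery proportion among flagged items. The sketch below is illustrative only (it is not the method of any paper above): given per-example p-values, for instance from a membership test scoring whether each example was in the training set, it returns the set of "discoveries" with FDR at most `alpha` under independence. The function name and example values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.1):
    """Benjamini-Hochberg step-up procedure (illustrative sketch).

    Returns a boolean mask of discoveries whose expected false discovery
    proportion is at most `alpha`, assuming independent p-values.
    """
    m = len(p_values)
    # Rank hypotheses by p-value (ascending).
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k with p_(k) <= k/m * alpha.
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= alpha * rank / m:
            k_max = rank
    # Reject (i.e., flag as discoveries) the k_max smallest p-values.
    discoveries = [False] * m
    for i in order[:k_max]:
        discoveries[i] = True
    return discoveries

# Hypothetical p-values, e.g. from a training-membership test per example;
# small p-values suggest the example was likely in the training data.
pvals = [0.001, 0.008, 0.039, 0.041, 0.30, 0.62, 0.74]
mask = benjamini_hochberg(pvals, alpha=0.05)  # flags the first two examples
```

The step-up threshold `k/m * alpha` is what distinguishes FDR control from a plain per-test cutoff: it adapts to how many hypotheses pass, trading some strictness for substantially higher power when many signals are present.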

Sources

High-Power Training Data Identification with Provable Statistical Guarantees

SAFER: Risk-Constrained Sample-then-Filter in Large Language Models

Reverse Supervision at Scale: Exponential Search Meets the Economics of Annotation

Human Uncertainty-Aware Data Selection and Automatic Labeling in Visual Question Answering

Selective Labeling with False Discovery Rate Control
