Advances in Explainable AI for Online Safety and Decision Support

The field of artificial intelligence is moving toward more transparent and explainable models, particularly in high-stakes applications such as online safety and decision support. Researchers are building tools that not only detect harmful content or make recommendations but also provide clear explanations for their decisions, a shift driven by the need to build trust and understanding between humans and AI systems.

Notable papers in this area include WATCHED, a web-based AI agent designed to support content moderators in combating hate speech, and DynaGuard, a dynamic guardian model that evaluates text against user-defined policies. Designing Effective AI Explanations for Misinformation Detection presents a comparative study of content, social, and combined explanations for AI-driven misinformation detection. Towards Personalized Explanations for Health Simulations proposes a mixed-methods framework for stakeholder-centric summarization of health simulations, while TAXAL offers a triadic fusion framework that combines cognitive, functional, and causal dimensions to explain large language models. REMI contributes a causal schema memory architecture for personalized lifestyle recommendation agents. Finally, Explained, yet misunderstood shows how AI literacy shapes HR managers' interpretation of explainable AI elements in recruiting recommender systems, and Interpretability as Alignment argues that interpretability should be treated as a design principle for alignment in AI research and development.
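Several of these systems hinge on checking text against a user-supplied policy rather than a fixed label set, so a minimal sketch may help make the pattern concrete. The snippet below is illustrative only, assuming a generic chat-style LLM callable; the prompt format, function names, and keyword-based stand-in model are assumptions for the example, not DynaGuard's actual interface, prompts, or training setup.

```python
from typing import Callable

# Illustrative sketch of policy-conditioned moderation in the spirit of a
# dynamic guardian/guardrail model. All names and prompt wording here are
# assumptions, not taken from the DynaGuard paper.

def build_guardian_prompt(policy: str, text: str) -> str:
    """Compose a prompt asking a guardian model to judge text against a policy."""
    return (
        "You are a guardian model. Decide whether the text violates the policy.\n"
        f"Policy: {policy}\n"
        f"Text: {text}\n"
        "Answer with PASS or FAIL on the first line, then a one-sentence explanation."
    )

def evaluate_against_policy(
    policy: str,
    text: str,
    ask_model: Callable[[str], str],
) -> dict:
    """Run one policy check; `ask_model` is any chat-style LLM callable."""
    reply = ask_model(build_guardian_prompt(policy, text))
    verdict, _, explanation = reply.partition("\n")
    return {
        "violates_policy": verdict.strip().upper().startswith("FAIL"),
        "explanation": explanation.strip(),
    }

if __name__ == "__main__":
    # Trivial keyword-based stand-in so the sketch runs without an LLM backend.
    def toy_model(prompt: str) -> str:
        if "refund" in prompt.lower():
            return "FAIL\nThe text promises a refund, which the policy forbids."
        return "PASS\nNo policy-relevant content found."

    result = evaluate_against_policy(
        policy="Agents must never promise refunds.",
        text="Sure, I can promise you a full refund today.",
        ask_model=toy_model,
    )
    print(result)
```

The key design point the sketch tries to capture is that the policy is an input at evaluation time, so the same checker can enforce different user-defined rules without retraining; the actual paper should be consulted for how DynaGuard implements and trains this behavior.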

Sources

WATCHED: A Web AI Agent Tool for Combating Hate Speech by Expanding Data

DynaGuard: A Dynamic Guardian Model With User-Defined Policies

Designing Effective AI Explanations for Misinformation Detection: A Comparative Study of Content, Social, and Combined Explanations

Towards Personalized Explanations for Health Simulations: A Mixed-Methods Framework for Stakeholder-Centric Summarization

Triadic Fusion of Cognitive, Functional, and Causal Dimensions for Explainable LLMs: The TAXAL Framework

REMI: A Novel Causal Schema Memory Architecture for Personalized Lifestyle Recommendation Agents

Explained, yet misunderstood: How AI Literacy shapes HR Managers' interpretation of User Interfaces in Recruiting Recommender Systems

Interpretability as Alignment: Making Internal Understanding a Design Principle
