Hate Speech Detection and Mitigation

Research on hate speech detection and mitigation is moving toward more nuanced, context-aware approaches. Researchers are exploring reinforcement learning, modular deep learning frameworks, and culture-aware frameworks to improve detection accuracy and address the complexities of hate speech. Noteworthy papers in this area include RV-HATE, which introduces a detection framework that adapts to dataset-specific characteristics and provides interpretable insights into the distinctive features of each dataset; Unpacking Hateful Memes, which develops a framework that captures the fundamental nature of hate by modeling presupposed context and detecting false claims; and Seeing Hate Differently, which proposes a culture-aware framework that constructs individuals' hate subspaces to address data sparsity, cultural entanglement, and ambiguous labeling.

Sources

A Multi-Component Reward Function with Policy Gradient for Automated Feature Selection with Dynamic Regularization and Bias Mitigation

PromptGuard at BLP-2025 Task 1: A Few-Shot Classification Framework Using Majority Voting and Keyword Similarity for Bengali Hate Speech Detection

Unpacking Hateful Memes: Presupposed Context and False Claims

Sarcasm Detection Using Deep Convolutional Neural Networks: A Modular Deep Learning Framework

RV-HATE: Reinforced Multi-Module Voting for Implicit Hate Speech Detection

Bridging Gaps in Hate Speech Detection: Meta-Collections and Benchmarks for Low-Resource Iberian Languages

Evaluating Open-Source Vision-Language Models for Multimodal Sarcasm Detection

Hypernetworks for Perspectivist Adaptation

Seeing Hate Differently: Hate Subspace Modeling for Culture-Aware Hate Speech Detection