The field of hate speech detection and mitigation is moving toward more nuanced, context-aware approaches. Researchers are exploring reinforcement learning, modular deep learning frameworks, and culture-aware frameworks to improve detection accuracy and address the complexities of hate speech. Noteworthy papers in this area include RV-HATE, which introduces a detection framework that adapts to dataset-specific characteristics and provides interpretable insights into the distinctive features of each dataset; Unpacking Hateful Memes, which develops a framework that captures the fundamental nature of hate by modeling presupposed context and detecting false claims; and Seeing Hate Differently, which proposes a culture-aware framework that constructs individuals' hate subspaces to address data sparsity, cultural entanglement, and ambiguous labeling.
Hate Speech Detection and Mitigation
Sources
A Multi-Component Reward Function with Policy Gradient for Automated Feature Selection with Dynamic Regularization and Bias Mitigation
PromptGuard at BLP-2025 Task 1: A Few-Shot Classification Framework Using Majority Voting and Keyword Similarity for Bengali Hate Speech Detection