The field of online content moderation and hate speech detection is rapidly evolving, with a growing focus on multimodal approaches that incorporate text, image, and video analysis. Recent research has highlighted the importance of addressing the mental well-being of content moderators, who are often exposed to harmful and offensive content.
Techniques such as text-based content modification and multimodal fusion frameworks have shown promise in improving the accuracy and robustness of hate speech detection systems. Large language models and transformer-based architectures have also been explored, with notable successes in hate speech detection and multimodal sarcasm understanding.
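To make the transformer-based approach concrete, the sketch below runs an off-the-shelf hate speech classifier with the Hugging Face transformers library. The checkpoint is an illustrative public model chosen for the example, not one used by the papers summarized here.

```python
# Minimal sketch: transformer-based hate speech detection via Hugging Face.
# The checkpoint is an illustrative public model, not one from the papers
# discussed in this section.
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="facebook/roberta-hate-speech-dynabench-r4-target",  # assumed checkpoint
)

examples = [
    "I hope you have a wonderful day!",
    "People like you should not be allowed to speak.",
]
for text in examples:
    result = classifier(text)[0]  # dict with "label" and "score"
    print(f"{result['label']} ({result['score']:.3f}): {text}")
```

Swapping the checkpoint for a multimodal or Chinese-language model follows the same pattern, which is part of why transformer pipelines have become the default baseline in this area.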
Noteworthy papers in this area include MMBERT, which proposes a BERT-based multimodal framework for robust Chinese hate speech detection; ToxicTAGS, which introduces a first-of-its-kind dataset of real-world meme-based posts enriched with tag annotations that add context to each meme; and Advancing Hate Speech Detection with Transformers, which evaluates multiple state-of-the-art transformer models on the MetaHate dataset and reports its best performance with fine-tuned ELECTRA.
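For concreteness, here is a hedged sketch of the kind of fine-tuning setup the last paper describes: ELECTRA fine-tuned for binary hate speech classification with the transformers Trainer API. A toy in-memory dataset stands in for MetaHate, whose access path and preprocessing are not specified here, and the hyperparameters are illustrative assumptions rather than the paper's reported configuration.

```python
# Hedged sketch: fine-tuning ELECTRA for binary hate speech classification.
# The toy dataset stands in for MetaHate; hyperparameters are assumptions.
from datasets import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

checkpoint = "google/electra-base-discriminator"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

# Toy stand-in for MetaHate: (text, label) pairs with 1 = hate, 0 = non-hate.
train = Dataset.from_dict({
    "text": ["example hateful post", "example benign post"],
    "label": [1, 0],
})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)

train = train.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="electra-hate",
    per_device_train_batch_size=16,
    num_train_epochs=3,       # assumed; the paper's schedule is not given here
    learning_rate=2e-5,       # assumed; a common fine-tuning default
)

Trainer(model=model, args=args, train_dataset=train).train()
```

Because ELECTRA's discriminator is pretrained with a replaced-token detection objective rather than masked language modeling, it often fine-tunes sample-efficiently on classification tasks, which is consistent with it topping the comparison in that evaluation.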