Multilingual Text Analysis and Hate Speech Detection

Natural language processing research is becoming more inclusive, with growing attention to multilingual text analysis and hate speech detection. Researchers are developing more reliable evaluation pipelines for text style transfer and detoxification, along with more effective methods for detecting hate speech in low-resource languages. A key challenge is the need for nuanced, culturally sensitive approaches to understanding and mitigating online harm, particularly in non-Western contexts. To address this, researchers are centering local disability experiences and cultural perspectives in the design and evaluation of AI systems. Notable papers in this area include: Evaluating Text Style Transfer: A Nine-Language Benchmark for Text Detoxification, which provides a comprehensive multilingual study of how text detoxification systems are evaluated; Disability Across Cultures: A Human-Centered Audit of Ableism in Western and Indic LLMs, which reveals stark differences between Western and Indic language models in recognizing ableist harm; and BIDWESH: A Bangla Regional Based Hate Speech Detection Dataset, which introduces a new dataset for hate speech detection in Bangla regional dialects.

Sources

Evaluating Text Style Transfer: A Nine-Language Benchmark for Text Detoxification

Disability Across Cultures: A Human-Centered Audit of Ableism in Western and Indic LLMs

BIDWESH: A Bangla Regional Based Hate Speech Detection Dataset
