Advances in Natural Language Processing for Social Media Analysis

The field of natural language processing is moving towards more nuanced and context-aware analysis of social media content. Researchers are developing methods to detect and mitigate harmful content such as offensive language and hate speech, including hate speech targeted at children. Specialized datasets are filling critical resource gaps: ChildGuard addresses child-targeted hate speech, and ANUBHUTI covers sentiment analysis in Bangla regional languages, a low-resource setting. Transfer learning and large language models are also improving the accuracy of these systems. Noteworthy papers include:

  • Hope Speech Detection in code-mixed Roman Urdu tweets, which introduces a carefully annotated dataset and a custom attention-based transformer model for hope speech detection.
  • Leveraging the Potential of Prompt Engineering for Hate Speech Detection in Low-Resource Languages, which pioneers the use of metaphor prompting to circumvent built-in safety mechanisms of large language models.

Sources

Hope Speech Detection in code-mixed Roman Urdu tweets: A Positive Turn in Natural Language Processing

ChildGuard: A Specialized Dataset for Combatting Child-Targeted Hate Speech

ANUBHUTI: A Comprehensive Corpus For Sentiment Analysis In Bangla Regional Languages

Offensive Language Detection on Social Media Using XLNet

PromptAug: Fine-grained Conflict Classification Using Data Augmentation

Data Augmentation for Cognitive Behavioral Therapy: Leveraging ERNIE Language Models using Artificial Intelligence

Leveraging a Multi-Agent LLM-Based System to Educate Teachers in Hate Incidents Management

Leveraging the Potential of Prompt Engineering for Hate Speech Detection in Low-Resource Languages
