Advances in Online Content Moderation and Social Media Governance

The field of online content moderation and social media governance is evolving rapidly, with growing attention to toxic content, misinformation, and partisan skew. Recent research stresses that moderation should be principled, consistent, contextual, proactive, transparent, and accountable, and it documents structural misalignments between corporate incentives and the public interest. Other studies examine how well large language models detect subtle linguistic cues, such as seniority inflation and implicit expertise in resumes, and whether they can deliver acceptance and commitment therapy. Work on platform algorithms investigates how recommender systems shape the topic, political skew, and reliability of the information served to users, underscoring the need for continued study of these systems and their role in democratic processes.

Noteworthy papers include 'Content Moderation Futures', which examines the failures and possibilities of contemporary social media governance, and 'The Thinking Therapist', which studies how post-training methodology and explicit reasoning affect the ability of large language models to deliver acceptance and commitment therapy. 'The Role of Follow Networks and Twitter's Content Recommender' and 'TikTok Rewards Divisive Political Messaging' offer further insight into how platforms shape users' experiences and can reward divisive political messaging.
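Several of the sources below evaluate machine-learning toxicity and hate speech detection, from traditional methods to transformers. As a minimal sketch of the traditional end of that spectrum, a multinomial Naive Bayes classifier over bag-of-words counts can be written in pure Python; the class name, labels, and toy training examples here are invented for illustration and are not drawn from any of the cited papers:

```python
from collections import Counter
import math


def tokenize(text):
    """Lowercase whitespace tokenization; real systems would normalize punctuation."""
    return text.lower().split()


class NaiveBayesToxicity:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing over unigrams."""

    def __init__(self):
        self.word_counts = {"toxic": Counter(), "ok": Counter()}
        self.doc_counts = {"toxic": 0, "ok": 0}

    def fit(self, examples):
        # examples: iterable of (text, label) pairs with label in {"toxic", "ok"}
        for text, label in examples:
            self.doc_counts[label] += 1
            self.word_counts[label].update(tokenize(text))

    def predict(self, text):
        vocab = set(self.word_counts["toxic"]) | set(self.word_counts["ok"])
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in ("toxic", "ok"):
            # log prior + sum of smoothed log likelihoods for each token
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy training data, for illustration only.
train = [
    ("you are an idiot and a loser", "toxic"),
    ("shut up you worthless troll", "toxic"),
    ("thanks for sharing this helpful thread", "ok"),
    ("great point, I appreciate the context", "ok"),
]
clf = NaiveBayesToxicity()
clf.fit(train)
print(clf.predict("you are a worthless idiot"))  # "toxic" on this toy data
```

This kind of lexical baseline is exactly what transformer-based classifiers are measured against in comparative evaluations, and it is also what makes adversarial attacks (paraphrasing, LLM-generated rewording) effective, since the model sees only surface word counts.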

Sources

Content Moderation Futures

Reading Between the Lines: Classifying Resume Seniority with Large Language Models

The Thinking Therapist: Training Large Language Models to Deliver Acceptance and Commitment Therapy using Supervised Fine-Tuning and Odds Ratio Policy Optimization

The Role of Follow Networks and Twitter's Content Recommender on Partisan Skew and Rumor Exposure during the 2022 U.S. Midterm Election

A Taxonomy of Response Strategies to Toxic Online Content: Evaluating the Evidence

Request a Note: How the Request Function Shapes X's Community Notes System

A Framework for AI-Supported Mediation in Community-based Online Collaboration

Incongruent Positivity: When Miscalibrated Positivity Undermines Online Supportive Conversations

TikTok Rewards Divisive Political Messaging During the 2025 German Federal Election

The Language of Approval: Identifying the Drivers of Positive Feedback Online

Mitigating Strategy Preference Bias in Emotional Support Conversation via Uncertainty Estimations

Towards Inclusive Toxic Content Moderation: Addressing Vulnerabilities to Adversarial Attacks in Toxicity Classifiers Tackling LLM-generated Content

Podcasts as a Medium for Participation in Collective Action: A Case Study of Black Lives Matter

Defining, Understanding, and Detecting Online Toxicity: Challenges and Machine Learning Approaches

Efficient Hate Speech Detection: Evaluating 38 Models from Traditional Methods to Transformers

Value Alignment of Social Media Ranking Algorithms

A Comparative Analysis of Transformer Models in Social Bot Detection

SMARTER: A Data-efficient Framework to Improve Toxicity Detection with Explanation via Self-augmenting Large Language Models