Advances in Graph Learning and Natural Language Processing for Social Media Analysis

Research in graph learning and natural language processing is converging on tighter integration of large language models (LLMs) and graph neural networks (GNNs), with node classification as a primary target task. Proposed ways of combining the two include preference-driven knowledge distillation and node-aware fusion architectures. These approaches report performance gains, particularly on heterophilous nodes (nodes whose neighbors mostly carry different labels) and in zero-shot learning scenarios. There is also growing interest in applying these techniques to social media analysis, including depression risk detection, suicidal ideation detection, and crisis news analysis. Noteworthy papers include:

  • Preference-driven Knowledge Distillation for Few-shot Node Classification, which proposes a framework to synergize LLMs and GNNs for few-shot node classification.
  • Glance for Context: Learning When to Leverage LLMs for Node-Aware GNN-LLM Fusion, which introduces a node-aware fusion architecture that adaptively invokes LLMs to refine GNN predictions.
  • CNSocialDepress: A Chinese Social Media Dataset for Depression Risk Detection and Structured Analysis, which releases a benchmark dataset for depression risk detection from Chinese social media posts.
  • Integrating Sequential and Relational Modeling for User Events: Datasets and Prediction Tasks, which proposes a unified formalization for modeling user events and releases a collection of datasets.
  • CrisisNews: A Dataset Mapping Two Decades of News Articles on Online Problematic Behavior at Scale, which presents a dataset of news articles covering social media-endemic crises.
  • Leveraging Language Semantics for Collaborative Filtering with TextGCN and TextGCN-MLP: Zero-Shot vs In-Domain Performance, which proposes TextGCN and TextGCN-MLP architectures for incorporating language semantics into recommender systems.
  • Suicidal Comment Tree Dataset: Enhancing Risk Assessment and Prediction Through Contextual Analysis, which constructs a high-quality annotated dataset for predicting users' suicidal risk levels.
  • Detecting Early and Implicit Suicidal Ideation via Longitudinal and Information Environment Signals on Social Media, which develops a computational framework for detecting early and implicit suicidal ideation on social media.
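
To make two of the ideas above concrete, here is a minimal sketch in plain Python. It is not taken from any of the listed papers. `node_homophily` computes the standard per-node homophily ratio (the fraction of a node's neighbors that share its label), which is one common way to identify heterophilous nodes. `fuse_predictions` illustrates one possible form of node-aware fusion: keep the GNN prediction when its confidence is high and fall back to an LLM otherwise. The function names, the confidence threshold of 0.7, and the `llm_predict` callback are all hypothetical; the actual routing criteria used in the papers may differ.

```python
from typing import Callable, Dict, List, Sequence


def node_homophily(node: int,
                   neighbors: Dict[int, List[int]],
                   labels: Dict[int, int]) -> float:
    """Fraction of a node's neighbors that share its label.

    Values near 0 indicate a heterophilous node, values near 1 a
    homophilous one. Isolated nodes return 0.0 (a convention chosen
    for this sketch only).
    """
    nbrs = neighbors.get(node, [])
    if not nbrs:
        return 0.0
    same = sum(1 for n in nbrs if labels[n] == labels[node])
    return same / len(nbrs)


def fuse_predictions(gnn_probs: Dict[int, Sequence[float]],
                     llm_predict: Callable[[int], int],
                     threshold: float = 0.7) -> Dict[int, int]:
    """Confidence-gated fusion (an assumed rule, not a paper's method).

    For each node, keep the GNN's top class if its probability reaches
    `threshold`; otherwise ask the (hypothetical) LLM callback.
    """
    preds: Dict[int, int] = {}
    for node, probs in gnn_probs.items():
        best = max(range(len(probs)), key=lambda c: probs[c])
        if probs[best] >= threshold:
            preds[node] = best               # GNN is confident enough
        else:
            preds[node] = llm_predict(node)  # defer to the LLM
    return preds


# Toy graph: node 0 has neighbors 1 and 2, only one of which shares its label.
neighbors = {0: [1, 2], 1: [0], 2: [0]}
labels = {0: 0, 1: 0, 2: 1}
print(node_homophily(0, neighbors, labels))  # 0.5

# Stand-in for an LLM call; a real system would query a model here.
def dummy_llm(node: int) -> int:
    return 1

gnn_probs = {0: [0.9, 0.1], 1: [0.55, 0.45]}
print(fuse_predictions(gnn_probs, dummy_llm))  # {0: 0, 1: 1}
```

The appeal of gating of this kind is cost: the LLM is queried only for nodes where the GNN is uncertain, rather than for every node in the graph.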

Sources

Preference-driven Knowledge Distillation for Few-shot Node Classification

Glance for Context: Learning When to Leverage LLMs for Node-Aware GNN-LLM Fusion

CNSocialDepress: A Chinese Social Media Dataset for Depression Risk Detection and Structured Analysis

Integrating Sequential and Relational Modeling for User Events: Datasets and Prediction Tasks

CrisisNews: A Dataset Mapping Two Decades of News Articles on Online Problematic Behavior at Scale

Leveraging Language Semantics for Collaborative Filtering with TextGCN and TextGCN-MLP: Zero-Shot vs In-Domain Performance

Suicidal Comment Tree Dataset: Enhancing Risk Assessment and Prediction Through Contextual Analysis

Detecting Early and Implicit Suicidal Ideation via Longitudinal and Information Environment Signals on Social Media
