Suicidal Ideation Detection Using Language Models

Transformer-based NLP system for early detection of suicidal ideation in Reddit posts.


Overview

This research develops an NLP system for early detection of suicidal ideation in social media posts. Published at ICCA 2024 as a first-author paper, the work combines pre-trained language models with Bidirectional GRU (Bi-GRU) networks, reaching up to 95.8% accuracy (BERT-BiGRU) while keeping false negatives low, which is crucial for potentially life-saving interventions.

The system analyzes Reddit posts to identify linguistic patterns and emotional markers associated with suicidal ideation, providing a scalable solution for mental health monitoring in online communities.

Tools & Technologies

Python, PyTorch, Transformers, BERT, RoBERTa, DistilBERT, ELECTRA, Bi-GRU, scikit-learn, NLTK, Pandas


Research Motivation

Suicide is a leading cause of death worldwide, with social media platforms increasingly becoming spaces where individuals express distressing thoughts. However, detecting these signals faces several challenges:

  • Implicit Expression: People often express suicidal thoughts indirectly due to stigma
  • Complex Language: Sarcasm, metaphors, and context-dependent meanings complicate detection
  • High Stakes: False negatives could mean missing someone in crisis
  • Scale Challenge: Manual monitoring of millions of posts is impossible

Our approach leverages deep learning to provide automated, scalable detection while maintaining high sensitivity to potential crisis situations.


Technical Innovation

1. Hybrid Architecture

Our approach combines two complementary components:

Pre-trained Language Models

Leverage contextual understanding from BERT, RoBERTa, DistilBERT, DistilRoBERTa, and ELECTRA-Small for deep semantic comprehension

Bidirectional GRU

Capture sequential dependencies and temporal patterns in both forward and backward directions for enhanced text understanding

2. Model Architecture

Input Text → Tokenization → Pre-trained Language Model →
Contextual Embeddings → Bidirectional GRU →
Dense Layer → Sigmoid → Binary Classification
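The pipeline above can be sketched in PyTorch as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the class name `EncoderBiGRUClassifier`, the hidden size, and the pooling choice (concatenating the final forward and backward GRU states) are all hypothetical. So that the example runs without downloading weights, a small `nn.Embedding` stands in for the pre-trained language model; with a Hugging Face encoder, the `last_hidden_state` of the model output would take its place. Padding masks are ignored for brevity.

```python
import torch
import torch.nn as nn


class EncoderBiGRUClassifier(nn.Module):
    """Encoder -> Bi-GRU -> dense layer -> sigmoid (hypothetical sketch)."""

    def __init__(self, encoder: nn.Module, encoder_dim: int, gru_hidden: int = 128):
        super().__init__()
        self.encoder = encoder  # maps token ids to (batch, seq_len, encoder_dim)
        self.bigru = nn.GRU(
            encoder_dim, gru_hidden, batch_first=True, bidirectional=True
        )
        self.dense = nn.Linear(2 * gru_hidden, 1)

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        embeddings = self.encoder(input_ids)
        # h_n has shape (2, batch, gru_hidden): final forward and backward states.
        _, h_n = self.bigru(embeddings)
        pooled = torch.cat([h_n[0], h_n[1]], dim=-1)
        # One probability per post; values near 1 mean "ideation detected".
        return torch.sigmoid(self.dense(pooled)).squeeze(-1)


torch.manual_seed(0)
# Assumption: an embedding table stands in for BERT, DistilBERT, etc.
stand_in_encoder = nn.Embedding(num_embeddings=1000, embedding_dim=32)
model = EncoderBiGRUClassifier(stand_in_encoder, encoder_dim=32, gru_hidden=16)

token_ids = torch.randint(0, 1000, (4, 12))  # 4 posts, 12 token ids each
probs = model(token_ids)                     # shape: (4,)
```

Training such a model would typically pair the sigmoid output with a binary cross-entropy loss; the source does not state the loss or hyperparameters used.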

3. Key Features

  • Contextual Understanding: Language models capture nuanced meanings and context
  • Sequential Pattern Recognition: Bi-GRU identifies temporal dependencies in text
  • Attention to Subtle Cues: Model learns to recognize indirect expressions of distress
  • Low False Negative Rate: Optimized to minimize missing at-risk individuals
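The source does not say how the false negative rate was minimized. One common technique, shown here purely as an assumption, is to choose the decision threshold on a validation set so that the share of missed positive cases stays under a target. The function name `pick_threshold`, the `max_fnr` parameter, and the scores below are hypothetical.

```python
def pick_threshold(probs, labels, max_fnr=0.05):
    """Return the highest threshold whose false negative rate is <= max_fnr.

    A post is predicted positive when its probability is >= the threshold.
    """
    positive_scores = sorted(p for p, y in zip(probs, labels) if y == 1)
    if not positive_scores:
        raise ValueError("need at least one positive example")
    # Number of positive examples we may miss without exceeding the target.
    allowed_misses = int(max_fnr * len(positive_scores))
    # Positives scoring strictly below this value are the only ones missed.
    return positive_scores[allowed_misses]


# Hypothetical validation scores: four positive posts, three negative posts.
val_probs = [0.2, 0.6, 0.7, 0.9, 0.1, 0.3, 0.65]
val_labels = [1, 1, 1, 1, 0, 0, 0]

threshold = pick_threshold(val_probs, val_labels, max_fnr=0.25)
missed = sum(1 for p, y in zip(val_probs, val_labels) if y == 1 and p < threshold)
```

Lowering the threshold in this way trades precision for recall, which matches the stated priority of not missing at-risk individuals.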

Results & Performance

BERT-BiGRU Accuracy: 95.8%
DistilBERT-BiGRU Accuracy: 95.2%
BERT False Negative Rate: 4.17%
DistilBERT False Negative Rate: 2.80%

Model Performance Comparison

Model              Accuracy   Precision   Recall   F1-Score   False Negative Rate
BERT-BiGRU         95.8%      94.2%       97.1%    95.6%      4.17%
DistilBERT-BiGRU   95.2%      93.8%       96.5%    95.1%      2.80%
RoBERTa-BiGRU      94.6%      93.1%       96.0%    94.5%      5.23%
ELECTRA-BiGRU      93.9%      92.4%       95.3%    93.8%      6.15%
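For readers unfamiliar with the column names, the sketch below computes each metric from confusion-matrix counts using the standard textbook definitions. The counts are hypothetical and are not the paper's data; whether the reported figures use exactly these definitions and the same evaluation split is not stated in the source.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from raw counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # share of at-risk posts that were caught
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        # Share of at-risk posts that were missed; equals 1 - recall.
        "false_negative_rate": fn / (fn + tp),
    }


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under these definitions, recall and false negative rate always sum to 1, so either one can be derived from the other.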

Ethical Considerations

This research was conducted with careful attention to:

  • Privacy: All data was anonymized and no personal identifiers were retained
  • Consent: Used only publicly available data from consenting platforms
  • Responsible Disclosure: Results shared with mental health organizations
  • Deployment Ethics: Emphasis on human-in-the-loop systems

Real-World Impact

Potential Applications

  • Social Media Monitoring: Early warning systems for platforms
  • Mental Health Support: Assisting counselors in identifying at-risk individuals
  • Research Tool: Understanding linguistic markers of mental distress
  • Crisis Prevention: Enabling timely interventions

Key Advantages

  • Scalability: Automated screening can process large volumes of posts without manual review
  • Consistency: Applies the same criteria to every post, reducing variability between human screeners
  • 24/7 Availability: Continuous monitoring capability
  • Multi-platform: Adaptable to various social media platforms

Citation

If you use this work in your research, please cite:

@inproceedings{shuvo2024early,
  title={Early Detection of Suicidal Ideation Using Bidirectional GRU 
         and Language Models},
  author={Shuvo, Shakil Mahmud and Novely, Navia and Faruk, Md Farukuzzaman 
          and Srizon, Azmain Yakin and Hasan, SM Mahedy},
  booktitle={Proceedings of the 3rd International Conference on Computing Advancements},
  pages={482--490},
  year={2024}
}