NLP Analysis on Subreddit Polarization

Sean McHale

Team: 4 members

Status: Complete

30 Apr 2024 • 5 min read

Tags: Python, Fine-tuning, Big Data, LangChain, NLTK, scikit-learn

This project analyzes online polarization surrounding the Israel-Palestine conflict using data from Reddit. LDA and BERTopic models were used to group posts into key topics, such as conflict-related violence and geopolitical discourse. A Hugging Face XLNet model was then fine-tuned to classify polarization.