Topic Modeling with Gensim LDA

Sean McHale

Team: Solo

Status: Complete

20 Mar 2023 • 6 min read

Tags:

Python

spaCy

This project applies topic modeling to posts from the LoseIt subreddit using Latent Dirichlet Allocation (LDA) as implemented in Gensim. spaCy handles text preprocessing: key nouns, verbs, and adjectives are extracted from each post, and overly common terms are then filtered out before the corpus is built. The number of topics is chosen by comparing coherence scores across candidate models, and the results are visualized with pyLDAvis.