Topic Modeling with Gensim LDA

Sean McHale
Team: Solo
Status: Complete
20 Mar 2023, 6 min read
Tags:
Python
spaCy

This project applies topic modeling to posts from the LoseIt subreddit using latent Dirichlet allocation (LDA), implemented with Gensim. spaCy handles text preprocessing: key nouns, verbs, and adjectives are extracted from each post, and overly common terms are filtered out before the corpus is built. The number of topics is chosen by comparing coherence scores, and the results are visualized with pyLDAvis.
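The common-term filtering and corpus-building step can be sketched in plain Python. This is a minimal illustration, not the project's actual code: the token lists, the function names `filter_common_terms` and `to_bow`, and the thresholds `no_below=2` and `no_above=0.5` are all assumptions chosen for the example. In a Gensim pipeline the same job is normally done by `Dictionary.filter_extremes` and `Dictionary.doc2bow`.

```python
from collections import Counter


def filter_common_terms(docs, no_below=2, no_above=0.5):
    """Keep tokens found in at least `no_below` documents and in at most
    a `no_above` fraction of all documents (thresholds are illustrative)."""
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each token once per document
    n_docs = len(docs)
    keep = {
        tok for tok, df in doc_freq.items()
        if df >= no_below and df / n_docs <= no_above
    }
    return [[tok for tok in tokens if tok in keep] for tokens in docs]


def to_bow(docs):
    """Map tokens to integer ids and encode each document as a sorted
    list of (token_id, count) pairs, i.e. a bag-of-words corpus."""
    vocab = {tok: i for i, tok in enumerate(sorted({t for d in docs for t in d}))}
    corpus = [sorted(Counter(vocab[t] for t in d).items()) for d in docs]
    return vocab, corpus


# Hypothetical, already-preprocessed posts (made-up tokens, not real data).
docs = [
    ["weight", "lose", "week", "calorie"],
    ["weight", "gym", "run", "week"],
    ["weight", "calorie", "count", "meal"],
    ["weight", "run", "meal", "gym"],
]

filtered = filter_common_terms(docs, no_below=2, no_above=0.5)
vocab, corpus = to_bow(filtered)
print(filtered)
print(corpus)
```

In this toy input, "weight" appears in every document and is dropped as too common, while "lose" and "count" appear only once and are dropped as too rare. The resulting `corpus` has the list-of-(id, count) shape that an LDA model expects as input.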