Gtopics - Inferencing human-oriented topics from data

Koustuv Sinha, Derek Ruths and David Jurgens

When training a topic-based document classification system, how do we decide which topics to train on? Traditionally, researchers have used unsupervised topic models such as LDA to generate clusters and then labeled the clusters that correspond to meaningful topics. Our work instead aims to show that meaningful topics can be inferred from social media discourse itself, so that a classifier can be trained directly on the topics inferred by our pipeline.