Question

I am currently using LDA to apply topic modeling to a corpus. Since LDA is unsupervised, it returns a set of words for each 'topic' but does not name the topic itself. Are there any algorithms that take a list of words and suggest which topics it could be categorized under?

For example, [cat, dog, fish] could be categorized as animals or pets.

One output for my model:

['game', 'week', 'fantasy', 'sportsline', 'play', 'players', 'league', 'random', 'sunday', 'season', 'agent', 'elink', 'exercise', 'start', 'yards', 'free', 'injury', 'expected', 'practice', 'getbad', 'weekly', 'year', 'reports', 'starting', 'luck', 'nat', 'nfl', 'weeks', 'smith', 'fast']

This could be categorized as football or sports.
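
To make the goal concrete, here is a naive sketch of the behaviour I am after. It is only an illustration, not a proposed solution: the `CANDIDATE_LABELS` lexicon and the `rank_labels` helper are hypothetical names I made up, and the word sets are hand-written assumptions rather than anything taken from a library or from my model.

```python
# Hypothetical, hand-written label lexicon (an assumption for illustration only).
CANDIDATE_LABELS = {
    "football": {"nfl", "yards", "league", "season", "players", "game", "injury"},
    "pets": {"cat", "dog", "fish", "hamster"},
    "finance": {"stock", "market", "shares", "price"},
}


def rank_labels(topic_words, candidate_labels=CANDIDATE_LABELS):
    """Rank labels by the fraction of each label's lexicon found in topic_words."""
    words = {w.lower() for w in topic_words}
    scores = {
        label: len(words & lexicon) / len(lexicon)
        for label, lexicon in candidate_labels.items()
    }
    # Highest-scoring label first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# A subset of the LDA topic words shown above.
topic = ["game", "week", "fantasy", "players", "league", "season", "yards", "injury", "nfl"]
ranking = rank_labels(topic)
print(ranking[0][0])  # football
```

Obviously this only works for labels I list by hand, which is why I am looking for a model or package that can propose the label on its own.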

Any suggestions, especially for Python models or packages, would be much appreciated.

No correct solution

Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange