I am planning an application which will make clusters of short messages/tweets based on topics. The number of topics will be limited like Sports [ NBA, NFL, Cricket, Soccer ], Entertainment [ movies, music ] and so on...
I can think of two approaches to this
I would like to know whether there are any other approaches to this problem. Or are there any ways of improving the above mentioned methods?
Also suggest some good clustering algorithms.I think "K-Nearest Clustering" algorithm is apt for this situation.
Check out Carrot2, this tool extracts the tags from the text and clusters. You can download it from here and check the algorithms implemented (Lingo, mainly) here.
Hope this help you.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With