I need to categorize a text or word to a particular category. For example, the text 'Pink Floyd' should be categorized as 'music' or 'Wikimedia' as 'technology' or 'Einstein' as 'science'.
How can this be done? Is there a way I can use the DBpedia for the same? If not, the database has to be trained from time to time, right?
This is a text classification problem. Manning, Raghavan and Schütze's Information Retrieval book chapter is a nice introduction. I think you do not need DBPedia nor NER for this, just a small labeled training data set with enough labeled examples for all of your classes.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With