
Text classification using Java

I need to assign a piece of text, or a single word, to a category. For example, 'Pink Floyd' should be categorized as 'music', 'Wikimedia' as 'technology', and 'Einstein' as 'science'.

How can this be done? Can I use DBpedia for this? If not, the classifier would have to be retrained from time to time, right?

madCode asked Jan 20 '23 20:01



1 Answer

This is a text classification problem. The text classification chapter of Manning, Raghavan and Schütze's Introduction to Information Retrieval is a nice introduction. I think you need neither DBpedia nor NER (named-entity recognition) for this, just a small training set with enough labeled examples for each of your classes.
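
To make the idea concrete, here is a minimal sketch of a multinomial Naive Bayes classifier with add-one smoothing in plain Java, which is one common way to learn from a small labeled set. The class name `NaiveBayesSketch`, its methods, and the training sentences are all made up for illustration; a real project would more likely use an existing library and far more data.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Illustrative multinomial Naive Bayes with add-one smoothing (hypothetical class). */
class NaiveBayesSketch {
    private final Map<String, Integer> docsPerClass = new HashMap<>();
    private final Map<String, Map<String, Integer>> wordCounts = new HashMap<>();
    private final Map<String, Integer> wordsPerClass = new HashMap<>();
    private final Set<String> vocabulary = new HashSet<>();
    private int totalDocs = 0;

    /** Lowercases the text and splits on anything that is not a letter or digit. */
    private static String[] tokenize(String text) {
        return text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+");
    }

    /** Adds one labeled example to the model's counts. */
    void train(String label, String text) {
        totalDocs++;
        docsPerClass.merge(label, 1, Integer::sum);
        Map<String, Integer> counts = wordCounts.computeIfAbsent(label, k -> new HashMap<>());
        for (String w : tokenize(text)) {
            if (w.isEmpty()) continue;
            counts.merge(w, 1, Integer::sum);
            wordsPerClass.merge(label, 1, Integer::sum);
            vocabulary.add(w);
        }
    }

    /** Returns the label with the highest log-probability, or null if untrained. */
    String classify(String text) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String label : docsPerClass.keySet()) {
            // Log prior: fraction of training examples carrying this label.
            double score = Math.log(docsPerClass.get(label) / (double) totalDocs);
            Map<String, Integer> counts = wordCounts.get(label);
            int total = wordsPerClass.getOrDefault(label, 0);
            for (String w : tokenize(text)) {
                // Words never seen in training are ignored.
                if (w.isEmpty() || !vocabulary.contains(w)) continue;
                int c = counts.getOrDefault(w, 0);
                // Add-one (Laplace) smoothing avoids log(0) for unseen pairs.
                score += Math.log((c + 1.0) / (total + vocabulary.size()));
            }
            if (score > bestScore) {
                bestScore = score;
                best = label;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        NaiveBayesSketch nb = new NaiveBayesSketch();
        // Made-up training examples; replace with your own labeled data.
        nb.train("music", "Pink Floyd rock band album guitar");
        nb.train("music", "Beatles album songs band");
        nb.train("technology", "Wikimedia software servers wiki platform");
        nb.train("technology", "Linux kernel software open source");
        nb.train("science", "Einstein physics relativity theory");
        nb.train("science", "Darwin evolution biology theory");
        System.out.println(nb.classify("Pink Floyd album"));
        System.out.println(nb.classify("Einstein relativity"));
    }
}
```

Note that this sketch can only recognize words that occur in the training examples. A query made up entirely of unseen words is decided by the class priors alone, which is why each class needs enough labeled examples.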

Yuval F answered Feb 02 '23 08:02
