I'm doing some text mining on web pages. Currently I'm working with Java, but maybe there are more appropriate languages for what I want to do.
Examples of some things I want to do:
Determine the character type of a word based on its parts (letters, digits, symbols, etc.), such as Alphabetic, Number, Alphanumeric, Symbol, etc. (there are more types).
Discover stop words based on statistics.
Discover some grammatical classes (verb, noun, preposition, conjunction) based on statistics and some logic.
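To make the first item concrete, here is a minimal sketch in plain Java of how a word could be classified by the kinds of characters it contains. The class name `CharTypeClassifier`, the enum `CharType`, and the extra categories `MIXED` and `EMPTY` are hypothetical names chosen for illustration; the real set of types would depend on what the mining task needs.

```java
class CharTypeClassifier {

    // Hypothetical category names; MIXED and EMPTY are assumptions added for this sketch.
    enum CharType { ALPHABETIC, NUMBER, ALPHANUMERIC, SYMBOL, MIXED, EMPTY }

    static CharType classify(String word) {
        if (word == null || word.isEmpty()) {
            return CharType.EMPTY;
        }
        boolean hasLetter = false;
        boolean hasDigit = false;
        boolean hasSymbol = false;
        // Walk the word by code point so characters outside the BMP are handled too.
        for (int i = 0; i < word.length(); ) {
            int cp = word.codePointAt(i);
            if (Character.isLetter(cp)) {
                hasLetter = true;
            } else if (Character.isDigit(cp)) {
                hasDigit = true;
            } else {
                // Anything that is neither a letter nor a digit counts as a symbol here.
                hasSymbol = true;
            }
            i += Character.charCount(cp);
        }
        if (hasSymbol) {
            return (hasLetter || hasDigit) ? CharType.MIXED : CharType.SYMBOL;
        }
        if (hasLetter && hasDigit) {
            return CharType.ALPHANUMERIC;
        }
        return hasLetter ? CharType.ALPHABETIC : CharType.NUMBER;
    }

    public static void main(String[] args) {
        String[] samples = { "hello", "2024", "abc123", "!?", "e-mail" };
        for (String w : samples) {
            System.out.println(w + " -> " + classify(w));
        }
    }
}
```

With this sketch, a word such as "e-mail" falls into the assumed `MIXED` category because it combines letters with a symbol; whether that should instead be its own type is a design choice.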
I was thinking about using Prolog and R (I don't know much about these languages), but I don't know whether they are good for this or whether another language would be more appropriate.
Which one can I use? Good libraries for Java are welcome too.
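For the statistical stop-word idea, one possible starting point that needs no external library is to count in how many documents each word appears and keep the words that show up in most of them. The sketch below assumes this document-frequency heuristic; the class name `StopWordCandidates`, the method `find`, and the threshold parameter `minDocFraction` are hypothetical, and the tokenization (lowercasing and splitting on non-letters) is a deliberate simplification.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class StopWordCandidates {

    // Returns the words that occur in at least minDocFraction of the documents.
    static Set<String> find(List<String> documents, double minDocFraction) {
        Map<String, Integer> docFrequency = new HashMap<>();
        for (String doc : documents) {
            // Count each word at most once per document.
            Set<String> seen = new HashSet<>();
            for (String token : doc.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
                if (!token.isEmpty() && seen.add(token)) {
                    docFrequency.merge(token, 1, Integer::sum);
                }
            }
        }
        // A TreeSet keeps the candidates in alphabetical order.
        Set<String> candidates = new TreeSet<>();
        for (Map.Entry<String, Integer> e : docFrequency.entrySet()) {
            if (e.getValue() >= minDocFraction * documents.size()) {
                candidates.add(e.getKey());
            }
        }
        return candidates;
    }

    public static void main(String[] args) {
        List<String> docs = new ArrayList<>();
        docs.add("The cat sat on the mat");
        docs.add("The dog chased the cat");
        docs.add("A bird sang on the roof");
        System.out.println(find(docs, 1.0));
    }
}
```

On a real corpus the threshold would need tuning, and frequent content words (for example a site's own name) can slip in, so the result is better treated as a candidate list to review than as a final stop-word list.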
Python! It has a huge number of libraries in this area.
I have no knowledge of Prolog or R, but in my opinion Python is a lot better than Java for text mining and AI work.