I'm looking for a good open source POS Tagger in Java. Here's what I have come up with so far:

LingPipe
Stanford
LBJ
FastTag

Anybody got any recommendations?

What is a good Java library for Parts-Of-Speech tagging? [closed]

2 Answers

Are you looking to tag POS in a specific domain? Most general-purpose taggers are trained on newswire text, and they typically don't perform well in specialized domains (such as biomedical text). There are taggers trained specifically for such domains, for example dTagger (Java) for biomedical text.

For newswire text, Adwait Ratnaparkhi's MXPOST is very good and is the one I would recommend.

Other Java implementations include:

MontyLingua
Berkeley Parser (Not really a POS tagger but all full blown parsers will typically include POS taggers. Google for Java syntactic parsers and you will find many.)
QTag
LBJ

OpenNLP and LingPipe, as mentioned by the other posters, are also pretty decent.

Info on the state of the art in POS tagging can be found here. As you can see, LTAG-Spinal (also mentioned by another poster) ranks best as of now, but the differences among the taggers are small. I have not used LTAG myself.

Also note that the baseline performance for POS tagging is about 90%. Baseline here means: (a) tag every known word with its most frequent POS tag from a lexicon, and (b) tag every unknown word as a noun.
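
To make the baseline concrete, here is a minimal sketch in Java. The class name `BaselineTagger`, the `observe` and `tagAll` methods, and the use of the Penn Treebank tag `NN` for nouns are my own assumptions for illustration; none of this comes from the libraries mentioned above, and the training data is a toy example rather than a real lexicon.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Most-frequent-tag baseline (illustrative sketch, hypothetical class). */
class BaselineTagger {
    // Assumption: unknown words get the Penn Treebank singular-noun tag.
    private static final String UNKNOWN_TAG = "NN";

    // Lexicon built from tagged text: word -> (tag -> how often it was seen).
    private final Map<String, Map<String, Integer>> counts = new HashMap<>();

    /** Record one (word, tag) pair from a tagged corpus. */
    public void observe(String word, String tag) {
        counts.computeIfAbsent(word, w -> new HashMap<>())
              .merge(tag, 1, Integer::sum);
    }

    /** (a) Known word: its most frequent tag. (b) Unknown word: noun. */
    public String tag(String word) {
        Map<String, Integer> tagCounts = counts.get(word);
        if (tagCounts == null) {
            return UNKNOWN_TAG;
        }
        String best = UNKNOWN_TAG;
        int bestCount = 0;
        for (Map.Entry<String, Integer> entry : tagCounts.entrySet()) {
            if (entry.getValue() > bestCount) {
                best = entry.getKey();
                bestCount = entry.getValue();
            }
        }
        return best;
    }

    /** Tag each word independently; the baseline ignores context. */
    public List<String> tagAll(List<String> words) {
        List<String> tags = new ArrayList<>();
        for (String word : words) {
            tags.add(tag(word));
        }
        return tags;
    }

    public static void main(String[] args) {
        BaselineTagger tagger = new BaselineTagger();
        // Toy training data: "can" is seen twice as a modal, once as a noun.
        tagger.observe("the", "DT");
        tagger.observe("can", "MD");
        tagger.observe("can", "MD");
        tagger.observe("can", "NN");
        tagger.observe("run", "VB");

        // "zebra" was never observed, so it falls back to the noun tag.
        System.out.println(tagger.tagAll(Arrays.asList("the", "can", "run", "zebra")));
        // prints [DT, MD, VB, NN]
    }
}
```

Because each word is tagged without looking at its neighbours, "can" always comes out as a modal here, even in "a can of soup". Closing that kind of gap is what the real taggers above are for.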

145

answered Sep 18 '22 12:09

hashable

I have used OpenNLP with good results. You can also check out MorphAdorner.

answered Sep 19 '22 12:09

Shashikant Kore

Tags:

java

nlp

Glenn
