A question for anyone who has used the Java library Mallet's SimpleTagger class for Conditional Random Fields (CRF). Assume that I'm already using the multi-threaded option for the maximum number of CPUs I have available (this is the case): where would I start, and what kind of things should I try, if I need it to run faster?
A related question: is there a way to do something similar to Stochastic Gradient Descent, which would speed up the training process?
The type of training I want to do is simple:
Input:
Feature1 ... FeatureN SequenceLabel
...
Test Data:
Feature1 ... FeatureN
...
Output:
Feature1 ... FeatureN SequenceLabel
...
(Where features are the output of processing I have done on the data in my own code.)
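For reference, this kind of train/tag cycle can be driven from SimpleTagger's command line. A rough sketch (the jar path and file names are placeholders; `--threads` is the multi-threading option mentioned above):

```shell
# Train a CRF model from whitespace-separated "Feature1 ... FeatureN Label" lines,
# using multiple threads (adjust the count to your CPU).
java -cp mallet.jar cc.mallet.fst.SimpleTagger \
    --train true --threads 8 --model-file my.model train.txt

# Apply the trained model to unlabeled "Feature1 ... FeatureN" lines.
java -cp mallet.jar cc.mallet.fst.SimpleTagger \
    --model-file my.model test.txt
```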
I've had problems getting any CRF implementation other than Mallet to work even approximately, but I may have to backtrack and revisit one of the other implementations, or try a new one.
Yes, stochastic gradient descent is usually much faster than the L-BFGS optimizer used in Mallet. I would suggest you try CRFSuite, which you can train with either SGD or L-BFGS. You could also give Léon Bottou's SGD-based implementation a try, but that is more difficult to set up.
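With CRFSuite, the training algorithm is selected on the command line. A minimal sketch (file names are placeholders; CRFSuite expects the label first on each line, followed by tab-separated features, so you'd need to reorder your current format):

```shell
# Train with SGD (L2-regularized) instead of the default L-BFGS.
crfsuite learn -a l2sgd -m my.model train.txt

# For comparison, L-BFGS training:
crfsuite learn -a lbfgs -m my.model train.txt

# Tag new sequences with the trained model.
crfsuite tag -m my.model test.txt
```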
Otherwise, I believe CRF++ is the most widely used CRF software around. It is based on L-BFGS, though, so it might not be fast enough for you.
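If you do try CRF++, the basic workflow looks like this (file names are placeholders; the template file defines which feature/label combinations become model features, and recent versions accept a thread-count flag):

```shell
# Train: crf_learn <template> <training data> <output model>
# -p sets the number of threads, if your build supports it.
crf_learn -p 8 template train.txt my.model

# Tag: crf_test reads the same column format and appends predicted labels.
crf_test -m my.model test.txt
```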
Both CRFSuite and CRF++ should be easy to get started with.
Note that all of these will be slow if you have a large number of labels. At least CRFSuite can be configured to consider only the label n-grams actually observed in the training data (in an (n-1)th-order model), which typically makes both training and prediction much faster.