Recently,i have read about the "discriminative reranking for natural language processing" by Collins. I'm confused what does the reranking actually do? Add more global features to the rerank model? or something else?
If you mean this paper, then what is done is the following:
The reason why the second model is useful is that in generative models (such as naïve Bayes, HMMs, PCFGs), it can be hard to add features other than word identity, because the model would try to predict the probability of the exact feature vector instead of the separate features, which might not have occurred in the training data and will have P(vector|tree) = 0 and therefore P(tree|vector) = 0 (+ smoothing, but the problem remains). This is the eternal NLP problem of data sparsity: you can't build a training corpus that contains every single utterance that you'll want to handle.
Discriminative models such as MaxEnt are much better at handling feature vectors, but take longer to fit and can be more complicated to handle (although CRFs and neural nets have been used to construct parsers as discriminative models). Collins et al. try to find a middle ground between the fully generative and fully discriminative approaches.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With