I'm starting to use the NLTK library, and I want to check whether a sentence in English is correct or not.
Example:
"He see Bob" - not correct
"He sees Bob" - correct
I read this, but it's quite hard for me. I need an easier example.
Grammar checking is an active area of NLP research, so there isn't a 100% answer (maybe not even an 80% answer) at this time. The simplest approach (or at least a reasonable baseline) would be an n-gram language model: normalize the LM probabilities for utterance length and set a heuristic threshold that separates 'grammatical' from 'ungrammatical'.
You could use Google's n-gram corpus, or train your own on in-domain data. You might be able to do that with NLTK; you definitely could with LingPipe, the SRI Language Modeling Toolkit, or OpenGRM.
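To make the baseline concrete, here is a minimal sketch in plain Python (no NLTK required). It assumes a tiny toy corpus, add-one smoothing, and a hand-picked threshold; the names `TRAIN`, `THRESHOLD`, `avg_logprob` and `is_grammatical` are made up for illustration, and a real model would need far more training data and a threshold tuned on held-out sentences.

```python
import math
from collections import Counter

# Toy training data (an assumption for illustration only).
TRAIN = [
    "he sees bob",
    "she sees bob",
    "he likes bob",
    "she likes him",
    "bob sees him",
]

# Hypothetical cutoff, chosen by eye for this toy corpus.
THRESHOLD = -1.6


def tokens(sentence):
    # Lowercase, split on whitespace, and pad with sentence boundary markers.
    return ["<s>"] + sentence.lower().split() + ["</s>"]


def train_bigram(sentences):
    # Count single tokens and adjacent token pairs.
    unigrams, bigrams = Counter(), Counter()
    for s in sentences:
        toks = tokens(s)
        unigrams.update(toks)
        bigrams.update(zip(toks, toks[1:]))
    return unigrams, bigrams


def avg_logprob(sentence, unigrams, bigrams):
    # Add-one smoothed bigram log-probability, divided by the number of
    # bigrams so that long sentences are not penalized just for length.
    toks = tokens(sentence)
    vocab = len(unigrams) + 1  # observed types plus one slot for unseen words
    total = 0.0
    for prev, cur in zip(toks, toks[1:]):
        total += math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab))
    return total / (len(toks) - 1)


def is_grammatical(sentence, unigrams, bigrams, threshold=THRESHOLD):
    # Heuristic decision: a higher average log-probability counts as "grammatical".
    return avg_logprob(sentence, unigrams, bigrams) >= threshold


unigrams, bigrams = train_bigram(TRAIN)
for s in ["He sees Bob", "He see Bob"]:
    print(s, round(avg_logprob(s, unigrams, bigrams), 3), is_grammatical(s, unigrams, bigrams))
```

On this toy data "He sees Bob" scores higher than "He see Bob" only because "see" never appears in the training sentences. That is exactly the weakness of the approach: any unseen but perfectly grammatical word sequence is penalized the same way.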
That said, an n-gram model won't perform all that well. If it meets your needs, great, but if you want to do better, you'll have to train a machine-learning classifier. A grammaticality classifier would generally use features from syntactic and/or semantic processing (e.g. POS tags, dependency and constituency parses, etc.). You might look at some of the work from Joel Tetreault and the team he worked with at ETS, or Jennifer Foster and her team at Dublin.
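As a rough idea of what "features from syntactic processing" can look like, the sketch below turns a POS-tagged sentence into POS-bigram counts that could be fed to any classifier. The Penn Treebank tags are assigned by hand here (an assumption; in practice they would come from a tagger), and `pos_ngram_features` is a hypothetical helper, not part of any library.

```python
from collections import Counter


def pos_ngram_features(tagged, n=2):
    # tagged: list of (word, tag) pairs; only the tags are used.
    tags = ["<s>"] + [tag for _, tag in tagged] + ["</s>"]
    # Count each run of n consecutive tags, joined with "_" as the feature name.
    return Counter("_".join(tags[i:i + n]) for i in range(len(tags) - n + 1))


# Hand-assigned Penn Treebank tags (assumed, not produced by a tagger).
good = [("He", "PRP"), ("sees", "VBZ"), ("Bob", "NNP")]
bad = [("He", "PRP"), ("see", "VBP"), ("Bob", "NNP")]

print(pos_ngram_features(good))
print(pos_ngram_features(bad))
```

Note that the feature `PRP_VBP` also occurs in correct sentences such as "They see Bob", because the `PRP` tag does not encode number. This is one reason such classifiers usually combine POS features with lexical and parse-based ones.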
Sorry there isn't an easy and straightforward answer...