
Spell checker that uses a language model

I am looking for a spell checker that can use a language model.

I know there are a lot of good spell checkers such as Hunspell. However, as far as I can see, it does not take context into account, so it is only a token-based spell checker.

For example:

I lick eating banana

At the token level there are no misspellings here: every word is spelled correctly, but the sentence makes no sense. A "smart" spell checker would recognize that "lick" is a correctly spelled word, but that the author probably meant "like", which gives the sentence a meaning.

I have a set of correctly written sentences from a specific domain. I want to train a "smart" spell checker on them so that it learns a language model and recognizes that, even though "lick" is spelled correctly, the author meant "like".

I don't see such a feature in Hunspell. Can you suggest another spell checker that can do this?

user16168 asked Nov 10 '22 09:11


1 Answer

See "The Design of a Proofreading Software Service" by Raphael Mudge. It describes both the data sources (Wikipedia, blogs, etc.) and the algorithm (basically comparing probabilities) behind his approach. The source code of this system, After the Deadline, is available, but it is no longer actively maintained.
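To make "comparing probabilities" concrete, here is a minimal sketch of the general idea. It is not After the Deadline's actual implementation. It assumes a simple bigram model with add-one smoothing, trained on your own correct sentences, and it treats every vocabulary word within edit distance 2 as a candidate replacement. All function names and the `margin` threshold are made up for illustration.

```python
import math
from collections import Counter

def train_bigram_model(sentences):
    """Count unigrams and bigrams over lower-cased, whitespace-split sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def log_prob(prev, word, unigrams, bigrams):
    """Add-one smoothed log P(word | prev)."""
    return math.log((bigrams[(prev, word)] + 1) / (unigrams[prev] + len(unigrams)))

def edit_distance(a, b):
    """Plain Levenshtein distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def suggest(sentence, unigrams, bigrams, max_distance=2, margin=1.0):
    """Return (word_index, original, replacement) where a similar word fits the context better."""
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    vocab = [w for w in unigrams if w not in ("<s>", "</s>")]
    suggestions = []
    for i in range(1, len(tokens) - 1):
        prev, word, nxt = tokens[i - 1], tokens[i], tokens[i + 1]

        def score(w):
            # Log-probability of w given its left neighbour, plus the right neighbour given w.
            return log_prob(prev, w, unigrams, bigrams) + log_prob(w, nxt, unigrams, bigrams)

        candidates = [w for w in vocab
                      if w != word and edit_distance(word, w) <= max_distance]
        if not candidates:
            continue
        best = max(candidates, key=score)
        # Only flag the word if the best candidate is clearly more probable in this context.
        if score(best) - score(word) > margin:
            suggestions.append((i - 1, word, best))
    return suggestions

# Toy "domain corpus" of correct sentences (made up for this example).
corpus = [
    "i like eating banana",
    "we like eating apples",
    "they like eating bread",
    "cats lick their paws",
]
unigrams, bigrams = train_bigram_model(corpus)
print(suggest("I lick eating banana", unigrams, bigrams))  # [(1, 'lick', 'like')]
```

With a real domain corpus you would probably want longer n-grams, better smoothing and a tuned threshold, but the core step stays the same: score the sentence with the original word and with each similar-looking candidate, and flag the word only when a candidate is clearly more probable in that context.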

Daniel Naber answered Nov 22 '22 07:11