
How to implement a Python spell checker using Google's "Did you mean?"

Tags: python, api

I'm looking for a way to write a function in Python where you pass in a string and it returns whether it's spelled correctly. I don't want to check against a dictionary. Instead, I want it to check Google's spelling suggestions. That way, celebrity names and other proper nouns will count as being spelled correctly.

Here's where I'm at so far. It works most of the time, but it messes up with some celebrity names. For example, things like "cee lo green" or "posner" get marked as incorrect.

import httplib
import xml.dom.minidom

data = """
<spellrequest textalreadyclipped="0" ignoredups="0" ignoredigits="1" ignoreallcaps="1">
<text> %s </text>
</spellrequest>
"""

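def spellCheck(word_to_spell):
    # Post the text to Google's spell endpoint and read the XML reply.
    con = httplib.HTTPSConnection("www.google.com")
    con.request("POST", "/tbproxy/spell?lang=en", data % word_to_spell)
    response = con.getresponse()

    dom = xml.dom.minidom.parseString(response.read())
    dom_data = dom.getElementsByTagName('spellresult')[0]

    if dom_data.childNodes:
        # Each child node holds whitespace-separated suggestions;
        # after the loop, result holds those of the last child node.
        for child_node in dom_data.childNodes:
            result = child_node.firstChild.data.split()
        # Accept the input if it matches a suggestion, ignoring case.
        for word in result:
            if word_to_spell.upper() == word.upper():
                return True
        return False
    else:
        # No child nodes: nothing was flagged.
        return True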

Sean Gransee asked Dec 08 '11 09:12


2 Answers

Peter Norvig explains how to implement a spell checker in Python in his essay "How to Write a Spelling Corrector".
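
To give a rough idea of that approach: generate every string one edit away from the input, keep the ones that appear in a word-frequency table, and pick the most frequent. The sketch below is a simplified illustration, not Norvig's exact code. The toy CORPUS and the is_spelled_correctly helper are assumptions made for this example, and only single edits are considered.

```python
import re
from collections import Counter

# Hypothetical toy corpus; a real corrector would count words from a large text file.
CORPUS = ("the quick brown fox jumps over the lazy dog "
          "the spelling of a word is checked against known words")
WORDS = Counter(re.findall(r"[a-z]+", CORPUS.lower()))
LETTERS = "abcdefghijklmnopqrstuvwxyz"


def edits1(word):
    """All strings one delete, transpose, replace, or insert away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1]
    replaces = [left + c + right[1:] for left, right in splits if right for c in LETTERS]
    inserts = [left + c + right for left, right in splits for c in LETTERS]
    return set(deletes + transposes + replaces + inserts)


def known(words):
    """The subset of words that appear in the frequency table."""
    return {w for w in words if w in WORDS}


def correction(word):
    """Most frequent known candidate, falling back to the word itself."""
    word = word.lower()
    candidates = known([word]) or known(edits1(word)) or [word]
    return max(candidates, key=lambda w: WORDS[w])


def is_spelled_correctly(word):
    """Hypothetical helper: True if the word is in the frequency table."""
    return word.lower() in WORDS
```

With this toy corpus, correction("teh") gives "the", while a name like "posner" is reported as unknown. Proper nouns are only accepted if they occur in whatever text the frequency table is built from.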


duffymo answered Oct 11 '22 12:10


Rather than sticking with Google, try some of the other big players.

  1. If you really want to stay with search engines that meter page requests, Yahoo and Bing both provide some excellent features. Yahoo provides spell checking directly through YQL tables (free for non-commercial use, up to 5000 requests/day).

  2. There are a good number of Python APIs that can do much the same thing, including on the proper nouns you mentioned (they will sometimes get things wrong; after all, it is ultimately based on probability).

For the second option, here is a good list (all completely free):

  1. GNU Aspell (it even has Python bindings)
  2. PyEnchant
  3. Whoosh (it does a lot more than spell checking, but I think it has an edge there.)

I hope these give you a clear idea of how things work.
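
PyEnchant, for instance, exposes a check/suggest pair on its dictionary objects. The sketch below mimics that shape using only the standard library (difflib stands in for a real spelling backend, so this is not PyEnchant itself). The DICTIONARY and PROPER_NOUNS sets are hypothetical; in practice they would come from a dictionary file plus your own list of names.

```python
import difflib

# Hypothetical word lists, made up for this example.
DICTIONARY = {"green", "singer", "music", "album", "record"}
PROPER_NOUNS = {"cee lo green", "posner"}


def check(text):
    """True if the phrase is an allowed proper noun or every word is known."""
    phrase = " ".join(text.lower().split())
    if phrase in PROPER_NOUNS:
        return True
    return all(word in DICTIONARY for word in phrase.split())


def suggest(word, limit=3):
    """Up to `limit` close matches drawn from both word lists."""
    vocabulary = sorted(DICTIONARY | PROPER_NOUNS)
    return difflib.get_close_matches(word.lower(), vocabulary, n=limit, cutoff=0.6)
```

Here check("Cee Lo Green") and check("posner") both pass because the names are in the allow list, while check("gren") fails and suggest("gren") includes "green".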

Spell checking actually involves very complex mechanisms from areas such as machine learning, AI, and NLP, among others. That is why companies like Google and Yahoo don't offer their APIs entirely free.


Surya answered Oct 11 '22 14:10