Spell check algorithm outputs everything instead of just the typos (Python)?

Question

I'm basically trying to code a simple spell-check program that will prompt you for an input file, then analyze the input file for possible spelling errors (by using binary search to see if the word is in the dictionary), before printing them in the output file. However, currently, it outputs everything in the input file instead of just the errors... My code is as follows:

import re

with open('DICTIONARY1.txt', 'r') as file:
    content = file.readlines()
    dictionary = []
    for line in content:
        line = line.rstrip()
        dictionary.append(line)

def binary_search(array, target, low, high):
    mid = (low + high) // 2
    if low > high:
        return -1
    elif array[mid] == target:
        return mid
    elif target < array[mid]:
        return binary_search(array, target, low, mid-1)
    else:
        return binary_search(array, target, mid+1, high)

input = input("Please enter file name of file to be analyzed: ")
infile = open(input, 'r')
contents = infile.readlines()
text = []
for line in contents:
    for word in line.split():
        word = re.sub('[^a-z\ \']+', " ", word.lower())
        text.append(word)
infile.close()
outfile = open('TYPO.txt', 'w')
for data in text:
    if data.strip() == '':
        pass
    elif binary_search(dictionary, data, 0, len(data)) == -1:
        outfile.write(data + "\n")
    else:
        pass

# 'file' was already closed by the with block above
outfile.close()

I can't seem to figure out what's wrong. :( Any help would be very much appreciated! Thank you. :)

Abd Azrad · Accepted Answer

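I tried replacing len(data) with len(dictionary), since that made more sense to me, and it seems to work in my very limited tests. Strictly speaking, binary_search treats high as an inclusive index, so len(dictionary) - 1 is the safer upper bound: with len(dictionary), a word that sorts after the last dictionary entry can end up indexing one past the end of the list and raise an IndexError.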

I think you were passing the length of the word in question as the upper bound of the search. So if you were looking up the word "dog", you were only searching indices 0 through 3, i.e. the first four words in the dictionary. Since your dictionary is probably very large, almost no word was ever found, so nearly every word ended up in the output file.
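
To make the difference concrete, here is a minimal sketch comparing the original call with the corrected one. The seven-word list is made up and only stands in for DICTIONARY1.txt, which I am assuming is sorted and lowercase; contains() is a hypothetical helper, not part of your code, showing the same lookup with the standard bisect module.

```python
from bisect import bisect_left

def binary_search(array, target, low, high):
    """Return the index of target in sorted array[low..high], or -1 (bounds inclusive)."""
    if low > high:
        return -1
    mid = (low + high) // 2
    if array[mid] == target:
        return mid
    if target < array[mid]:
        return binary_search(array, target, low, mid - 1)
    return binary_search(array, target, mid + 1, high)

def contains(sorted_words, word):
    """Hypothetical helper: the same membership check using the bisect module."""
    i = bisect_left(sorted_words, word)
    return i < len(sorted_words) and sorted_words[i] == word

# Made-up word list standing in for DICTIONARY1.txt; it must be sorted.
dictionary = sorted(["apple", "banana", "cat", "dog", "egg", "fish", "zebra"])

word = "zebra"
# Original call: the upper bound is the word length (5), so index 6 is never searched.
buggy = binary_search(dictionary, word, 0, len(word))
# Corrected call: search the whole list, with the last valid index as the bound.
fixed = binary_search(dictionary, word, 0, len(dictionary) - 1)

print(buggy, fixed, contains(dictionary, word))
```

With this list, the original call reports "zebra" as missing even though it is in the dictionary, while the corrected call finds it. If lookup speed is all you need, loading the words into a set and testing membership with `in` would also work, but that is a different technique from the binary search your assignment seems to ask for.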

Tags:

python

Alaete
