Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Finding typos in a list of words

Tags:

java

algorithm

Given a list of correct words all of length l and a list of incorrect words also of length l, find the words in the list of incorrect words that differ from the list of corrects words by a swap of two consecutive letters. These words are deemed typos. For example, hte is considered a typo of the while het is not considered a typos.

What is the optimally time efficient algorithm that allows us to find the list of words deemed typos by this definition ?

I was informed that the list may be computed in linear time, but I fail to find the a solution in linear time. The only method I can think of is to compare brute force all the letters from one list with the other.

like image 632
user1246462 Avatar asked Oct 06 '22 06:10

user1246462


1 Answers

L - list of correct words of size n.
L'- list of incorrect words of size m.
H - hash table
R - result set

1. for each word w in L :
   1.1  for each typo t of w : //there are l-1 typos
     1.1.1 insert t into H
2. for each word w' in L' :
   2.1 if w' in H insert w' in R
3. return R

Time complexity : O(n*l)+O(m) = O(max(n*l,m))

Space complexity : O(n*l)

like image 130
Grisha Weintraub Avatar answered Oct 11 '22 03:10

Grisha Weintraub