Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Algorithm for finding first repeated substring of length k

There is a homework I should do and I need help. I should write a program to find the first substring of length k that is repeated in the string at least twice.

For example in the string "banana" there are two repeated substrings of length 2: "an" , "na". In this case, the answer is "an" because it appeared sooner than "na"

Note that the simple O(n^2) algorithm is not useful since there is time limit on execution time of program so I guess it should be in linear time.

There is a hint too: Use Hash table.

I don't want the code. I just want you to give me a clue because I have no idea how to do this using a hash table. Should I use a specific data structure too?

like image 965
Omid Avatar asked May 14 '11 22:05

Omid


1 Answers

Iterate over the character indexes of the string (0, 1, 2, ...) up to and including the index of the second-from-last character (i.e. up to strlen(str) - 2). For each iteration, do the following...

Extract the 2-char substring starting at the character index.

Check whether your hashtable contains the 2-char substring. If it does, you've got your answer.

Insert each 2-char substring into the hashtable.

This is easily modifiable to cope with substrings of length k.

like image 173
Will A Avatar answered Sep 25 '22 13:09

Will A