Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Find all (english word) substrings of a given string

This is an interview question: Find all (english word) substrings of a given string. (every = every, ever, very).

Obviously, we can loop over all substrings and check each one against an English dictionary, organized as a set. I believe the dictionary is small enough to fit the RAM. How to organize the dictionary ? As for as I remember, the original spell command loaded the words file in a bitmap, represented a set of words hash values. I would start from that.

Another solution is a trie built from the dictionary. Using the trie we can loop over all string characters and check the trie for each character. I guess the complexity of this solution would be the same in the worst case (O(n^2))

Does it make sense? Would you suggest other solutions?

like image 440
Michael Avatar asked Mar 02 '11 18:03

Michael


People also ask

How do you find all substrings in length K?

If the length of a string is N, then there can be N – K + 1 substring of length K. Generating these substrings will require O(N) complexity, and checking each substring requires O(K) complexity, hence making the overall complexity like O(N*K). Efficient approach: The idea is to use Window Sliding Technique.

How to check if a string is substring of another string in C?

The function strstr returns the first occurrence of a string in another string. This means that strstr can be used to detect whether a string contains another string. In other words, whether a string is a substring of another string.

What is substring of a string?

A substring is a subset or part of another string, or it is a contiguous sequence of characters within a string.


1 Answers

The Aho-Corasick string matching algorithm which "constructs a finite state machine that resembles a trie with additional links between the various internal nodes."
But everything considered the "build a trie from the English dictionary and do a simultaneous search on it for all suffixes of the given string" should be pretty good for an interview.

like image 124
Eugen Constantin Dinca Avatar answered Sep 21 '22 13:09

Eugen Constantin Dinca