Given a string s, what is the fastest method to generate a set of all its unique substrings?
Example: for s = "aba" we would get substrs = {"a", "b", "ab", "ba", "aba"}.
The naive algorithm would be to traverse the entire string, generating substrings of length 1..n in each iteration, yielding an O(n^2) upper bound.
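For reference, the naive approach can be sketched in Python roughly as follows. The function name `unique_substrings` is made up for illustration. The loops visit O(n^2) (start, end) pairs, but each slice copies up to n characters, so the total work in this version is closer to O(n^3).

```python
def unique_substrings(s):
    """Collect every distinct non-empty substring of s by brute force."""
    n = len(s)
    # One slice per (start, end) pair; the set drops the duplicates.
    return {s[i:j] for i in range(n) for j in range(i + 1, n + 1)}


print(sorted(unique_substrings("aba")))
# ['a', 'ab', 'aba', 'b', 'ba']
```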
Is a better bound possible?
(this is technically homework, so pointers-only are welcome as well)
Find the position of all matches for the string using the regex \w(?=\w\w). This will give you the start index of the first character of each required sub-string. In this case, you would get: 0, 1, 2, 3, 4, 8, 9, 10 and 11.
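A small Python sketch of that idea, assuming the intended pattern is `\w(?=\w\w)` (a word character followed by a lookahead for two more). The input string this answer refers to is not shown, so `sample` below is a made-up string chosen to produce the same indices, and `match_starts` is a hypothetical helper name.

```python
import re


def match_starts(text, pattern=r"\w(?=\w\w)"):
    """Return the start index of every match of pattern in text."""
    # The lookahead consumes nothing, so each match is one character wide
    # and the scan resumes at the very next position.
    return [m.start() for m in re.finditer(pattern, text)]


# Hypothetical input: a 7-letter word, a space, then a 6-letter word.
sample = "abcdefg abcdef"
print(match_starts(sample))
# [0, 1, 2, 3, 4, 8, 9, 10, 11]
```

Each index marks a position where a run of three word characters begins, so `sample[i:i + 3]` gives the corresponding sub-string.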
As other posters have said, there are potentially O(n^2) substrings for a given string, so printing them out cannot be done faster than that. However, there exists an efficient representation of the set that can be constructed in linear time: the suffix tree.
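A linear-time suffix tree (Ukkonen's algorithm, for instance) is fairly long to write out, so here is a sketch of a closely related structure instead, the suffix automaton. It is not the suffix tree itself, but it also represents all substrings compactly and, assuming a bounded alphabet, is built in time linear in the string length. The function name and the list-based state layout are illustrative choices. The sketch only counts the distinct substrings rather than listing them.

```python
def count_distinct_substrings(s):
    """Count distinct non-empty substrings of s with a suffix automaton."""
    length = [0]   # length[v]: longest substring that ends in state v
    link = [-1]    # link[v]: suffix link of state v (-1 for the root)
    trans = [{}]   # trans[v]: outgoing transitions, char -> state
    last = 0       # state that represents the whole prefix read so far

    for ch in s:
        cur = len(length)
        length.append(length[last] + 1)
        link.append(-1)
        trans.append({})

        # Add the new transition along the suffix-link chain.
        p = last
        while p != -1 and ch not in trans[p]:
            trans[p][ch] = cur
            p = link[p]

        if p == -1:
            link[cur] = 0
        else:
            q = trans[p][ch]
            if length[p] + 1 == length[q]:
                link[cur] = q
            else:
                # Split q: the clone keeps the shorter substrings.
                clone = len(length)
                length.append(length[p] + 1)
                link.append(link[q])
                trans.append(dict(trans[q]))
                while p != -1 and trans[p].get(ch) == q:
                    trans[p][ch] = clone
                    p = link[p]
                link[q] = clone
                link[cur] = clone
        last = cur

    # State v accounts for length[v] - length[link[v]] distinct substrings.
    return sum(length[v] - length[link[v]] for v in range(1, len(length)))


print(count_distinct_substrings("aba"))  # 5
```

Enumerating the substrings themselves means walking every path from the root, which takes time proportional to the size of the output, in line with the lower bound mentioned above.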