Longest substring that appears at least twice in O(n.logn)

Tags:

c++

string

substring

algorithm

longest-substring

Problem:

Given a string S of N characters (N <= 200 000), find the length of the longest substring that appears at least twice (the occurrences may overlap).

My solution:

Here's what I've tried:

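#include <algorithm>
#include <iostream>
#include <string>

int main()
{
    std::string s;
    std::cin >> s;
    int max = 0;
    typedef std::string::const_iterator sit;
    sit end = s.end();
    // Try every pair of start positions it1 < it2.
    for(sit it1 = s.begin(); it1 != end; ++it1)
        for(sit it2 = it1 + 1; it2 != end; ++it2)
            // Length of the common prefix of the suffixes starting at it1 and it2.
            // The cast is needed because std::max requires both arguments to have the same type.
            max = std::max(max, static_cast<int>(std::mismatch(it1, it1 + (end - it2), it2).first - it1));
    std::cout << max;
}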

Question:

However, the above solution gets TLE (time limit exceeded) because it runs in O(n^3). Is there any way to improve it so that it runs in O(n log n)?

asked Aug 28 '21 05:08

unglinh279

2 Answers

find the length of the longest substring that appears at least twice (the substrings can overlap)

This problem is commonly known as the longest repeated substring problem. It can be solved in linear time with a suffix tree.

To solve this problem:

1. Append a special terminator character '$' (one that does not occur in S) to the given string S.
2. Build a suffix tree from S.
3. The longest repeated substring of S is indicated by the deepest internal node in the suffix tree, where depth is measured by the number of characters traversed from the root.

Time complexity:

Building the suffix tree takes O(n log k) time, where k is the size of the alphabet (if k is considered a constant, the asymptotic behaviour is linear).
The tree traversal (to find the longest repeated substring) can be done in O(n) time.
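
A full suffix tree construction is long, so the sketch below swaps in a related structure instead: a suffix array plus an LCP array (Kasai's algorithm). The largest LCP value between adjacent sorted suffixes equals the depth of the deepest internal node described above. This is not the linear-time suffix tree method itself; the simple prefix-doubling construction used here runs in O(n log^2 n). The function names `build_suffix_array` and `longest_repeated_substring` are illustrative, not taken from the answer.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <string>
#include <vector>

// Suffix array by prefix doubling with std::sort: O(n log^2 n).
std::vector<int> build_suffix_array(const std::string& s) {
    const int n = static_cast<int>(s.size());
    std::vector<int> sa(n), rank(n), tmp(n);
    if (n == 0) return sa;
    std::iota(sa.begin(), sa.end(), 0);
    for (int i = 0; i < n; ++i) rank[i] = static_cast<unsigned char>(s[i]);
    for (int k = 1;; k *= 2) {
        // Order suffixes by (rank of first k chars, rank of the following k chars).
        auto cmp = [&](int a, int b) {
            if (rank[a] != rank[b]) return rank[a] < rank[b];
            const int ra = a + k < n ? rank[a + k] : -1;
            const int rb = b + k < n ? rank[b + k] : -1;
            return ra < rb;
        };
        std::sort(sa.begin(), sa.end(), cmp);
        tmp[sa[0]] = 0;
        for (int i = 1; i < n; ++i)
            tmp[sa[i]] = tmp[sa[i - 1]] + (cmp(sa[i - 1], sa[i]) ? 1 : 0);
        rank = tmp;
        if (rank[sa[n - 1]] == n - 1) break;  // all ranks distinct: fully sorted
    }
    return sa;
}

// Length of the longest substring occurring at least twice (overlaps allowed).
int longest_repeated_substring(const std::string& s) {
    const int n = static_cast<int>(s.size());
    const std::vector<int> sa = build_suffix_array(s);
    assert(static_cast<int>(sa.size()) == n);
    std::vector<int> pos(n);  // pos[i] = position of suffix i in the suffix array
    for (int i = 0; i < n; ++i) pos[sa[i]] = i;
    int best = 0, h = 0;
    // Kasai's algorithm: LCP of each suffix with its predecessor in sorted order.
    for (int i = 0; i < n; ++i) {
        if (pos[i] == 0) { h = 0; continue; }
        const int j = sa[pos[i] - 1];
        while (i + h < n && j + h < n && s[i + h] == s[j + h]) ++h;
        best = std::max(best, h);
        if (h > 0) --h;
    }
    return best;
}
```

To use it for the problem as stated, read S from standard input and print `longest_repeated_substring(S)`.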

answered Nov 02 '22 10:11

prakash sellathurai

A suffix tree is overkill for this problem. Binary search suffices, and the implementation is much easier.

Idea

The idea is simple: if there exists a duplicated substring of length L (L > 1), there must also exist one of length L - 1, because dropping the last character of each occurrence yields two occurrences of the shorter string. Therefore, if we let f(x) denote a function that returns true when a duplicated substring of length x exists, f(x) is monotonic: true up to some maximum length and false beyond it. That is what allows a binary search on the length.

Implementation

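Binary search on the length of the longest duplicated substring, and for each candidate length slide a window of that length over the string to check whether any substring occurs twice. Use string hashing to detect duplicates. Complexity: O(N log N) expected, since each check is O(N) with a rolling hash and a hash set, and the binary search performs O(log N) checks.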
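
As a rough illustration of the approach described here, the following is a minimal sketch assuming a double polynomial rolling hash. The moduli, bases, and the names `has_repeat` and `longest_repeated_by_hashing` are arbitrary choices made for this example, not part of the answer. Because it compares hashes only, a collision could in principle report a repeat that does not exist; comparing the actual substrings on a hash match would rule that out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_set>

// Hash parameters (assumed for this sketch): two independent moduli and bases.
constexpr std::uint64_t kMod1 = 1000000007ULL, kMod2 = 998244353ULL;
constexpr std::uint64_t kBase1 = 131, kBase2 = 137;

// f(len): true if some substring of length len occurs at least twice in s.
// Precondition: 1 <= len < s.size().
bool has_repeat(const std::string& s, std::size_t len) {
    const std::size_t n = s.size();
    assert(len >= 1 && len < n);
    std::uint64_t pow1 = 1, pow2 = 1;  // base^(len-1) under each modulus
    for (std::size_t i = 1; i < len; ++i) {
        pow1 = pow1 * kBase1 % kMod1;
        pow2 = pow2 * kBase2 % kMod2;
    }
    std::unordered_set<std::uint64_t> seen;
    seen.reserve(n - len + 1);
    std::uint64_t h1 = 0, h2 = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (i >= len) {
            // Slide the window: drop the character that leaves on the left.
            const std::uint64_t out = static_cast<unsigned char>(s[i - len]) + 1;
            h1 = (h1 + kMod1 - out * pow1 % kMod1) % kMod1;
            h2 = (h2 + kMod2 - out * pow2 % kMod2) % kMod2;
        }
        const std::uint64_t in = static_cast<unsigned char>(s[i]) + 1;
        h1 = (h1 * kBase1 + in) % kMod1;
        h2 = (h2 * kBase2 + in) % kMod2;
        // Once a full window is formed, a repeated key means a repeated substring
        // (up to hash collisions).
        if (i + 1 >= len && !seen.insert((h1 << 32) | h2).second) return true;
    }
    return false;
}

// Binary search for the largest len with has_repeat(s, len) == true.
int longest_repeated_by_hashing(const std::string& s) {
    std::size_t lo = 0;                               // length 0 is always feasible
    std::size_t hi = s.empty() ? 0 : s.size() - 1;    // length n cannot repeat
    while (lo < hi) {
        const std::size_t mid = lo + (hi - lo + 1) / 2;  // mid >= 1 here
        if (has_repeat(s, mid)) lo = mid;
        else hi = mid - 1;
    }
    return static_cast<int>(lo);
}
```

The search keeps `lo` as a length known to be feasible and shrinks `hi` whenever a check fails, which is valid only because f is monotonic.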

answered Nov 02 '22 09:11

Learning Mathematics
