I was solving DISTINCT SUBSTRING (given a string, we need to find the total number of its distinct substrings).
I am using trie of suffixes to solve it.
I am passing the test cases, but getting TLE
when I submit. Also, the space consumed is very large, at 4093M
.
Note: Since there can be 256 char in total, I am setting an array size of 257, and the ascii value is acting as the index.
What I am thinking now:
for(int i=0;i<p;i++){
string temp = str.substr(i,p-i);
insert1(root,temp);
}
Since substr()
may take O(n) time, in the worst case insert function also takes
(n) time in the worst case, and O(n) for loop: O(n^3). This is getting me TLE.
error: could not convert 'temp' from 'std::__cxx11::string* {aka std::__cxx11::basic_string*}' to 'std::__cxx11::string {aka std::__cxx11::basic_string}'| ||=== Build failed: 2 error(s), 0 warning(s) (0 minute(s), 0 second(s)) ===|
So I am thinking to replace substr()
by something like this:
for(int i=0;i<p;i++){
string *temp = &str[i];
insert1(root,temp);
} ///and it is giving me error please suggest here what is the mistake and what to do
So that I can save O(n) time.
Please tell me how can I modify my trie approach so it gets accepted.
#include<iostream>
#include<string>
using namespace std;
const int alphabetsize = 257;
int cnt=0;
struct trienode{
struct trienode* children[alphabetsize];
bool isendofword;
};
struct trienode *getnode(void){
struct trienode *newnode = new trienode;
newnode->isendofword = false;
for(int i=0;i<alphabetsize;i++){
newnode->children[i]=NULL;
}
return newnode;
}
void insert1(struct trienode* root,string &key){
struct trienode *temp = root;
for(int i=0;i<key.length();i++){
int index = key[i];
if(!temp->children[index]){
temp->children[index]=getnode();
cnt++;
}
temp = temp->children[index];
}
temp->isendofword=true;
}
int main(){
int t;
cin>>t;
while(t--){
cnt=0;
string str;
cin>>str;
int p = str.length();
struct trienode* root = getnode();
for(int i=0;i<p;i++){
string temp = str.substr(i,p-i);
insert1(root,temp);
}
cout<<cnt<<endl;
}
}
Number of substrings of a string of length n is (n * (n + 1) / 2).
The simplest approach is to generate all possible substrings from the given string and check for each substring whether it contains all distinct characters or not. If the length of string is N, then there will be N*(N+1)/2 possible substrings.
Basically you use itertools to first find all combinations, then set to find unique elements, then cast to list. Edit for a clearer explanation: First, use combinations to get all pair of indexes corresponding to substrings. The trick here is that itertools.
I'm not experienced in C++ but the error you commented about seems like it might stem from different expectations by the compiler for the type of variable it's receiving when encountering the variable temp
.
As others have noted in SPOJ and in the comments, since the input length is only 256 characters, you could possibly get away with brute-force hashing and counting all of the substring occurences.
Another option is to examine the longest common prefixes in the suffix array for the string, both of which there are known construction algorithms for. If we iterate from the end of the suffix array, the difference between the current suffix length and the longest common prefix with its neighbour to the right tells us how many new distinct substrings are introduced.
For example:
01234
CCCCC
suffix array:
01234
43210
CCCCC
CCCC
CCC
CC
C
i: 4 -> 5 new substrings
i: 3 -> lcp(3,4) = len(3) no new substrings
i: 2 -> lcp(2,3) = len(2) no new substrings
i: 1 -> lcp(1,2) = len(1) no new substrings
i: 0 -> lcp(0,1) = len(0) no new substrings
total: 5 distinct substrings
Second example:
01234
ABABA
suffix array:
01234
42031
AAABB
BBAA
AA B
B A
A
i: 4 -> 4 new substrings
i: 3 -> lcp(3,4) = len(3) no new substrings
i: 2 -> 5 new substrings
i: 1 -> lcp(1,2) = len(1) no new substrings
i: 0 -> lcp(0,1) = len(0) no new substrings
total: 9 distinct substrings
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With