I would like to reduce the complexity of the following algorithm. It takes a word as input and calculates the number of unique letters within it (the "entropy" of the word). My current solution uses three nested for loops, which comes out to a complexity of O(n^3). Since this code is part of a bigger project (we built a solver for the game Boggle), I was hoping to reduce the complexity of my algorithm in order to reduce its execution time. Thanks in advance!
int wordEntropy(string word)
{
    int length = word.length();
    int uniquewords = length;
    string compare = word;
    char save[17];
    int cond = 0;

    for (int ii = 0; ii < length; ii++)
    {
        for (int jj = ii + 1; jj < length; jj++)
        {
            for (int kk = 0; kk <= ii; kk++)
            {
                if (save[kk] == word[ii]) { cond++; }
            }
            if (word[ii] == word[jj])
            {
                if (cond > 0) { break; }
                uniquewords--;
            }
        }
        save[ii] = word[ii];
        cond = 0;
    }
    return uniquewords;
}
Putting it all together: to reduce the number of iterations, the general strategy has two steps: first build an "index" holding the information to be accessed later, then iterate over the input once and fetch that information from the previously created "index" instead of rescanning (see the sketch below).
When writing any industry-level algorithm, or even in competitive programming, one must keep its complexity in mind. To make anything scalable, we need to optimize our code for large data as much as possible; scalability and optimization are directly related.
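A minimal sketch of this two-step "index" idea in C++ (the function name and the 256-entry frequency table are illustrative assumptions, not from the original post):

#include <array>
#include <string>

int wordEntropyIndexed(const std::string &word)
{
    std::array<int, 256> freq{};        // the "index": one counter per byte value
    for (unsigned char c : word)        // pass 1: build the index, O(n)
        ++freq[c];

    int unique = 0;
    for (int count : freq)              // pass 2: fetch from the index, O(256)
        if (count > 0)
            ++unique;
    return unique;                      // two linear passes instead of O(n^3)
}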
One cheap solution is just to stick the characters in an std::unordered_set, which is a hash set (amortized O(1) insertion and lookup):
#include <string>
#include <unordered_set>

int wordEntropy(const std::string &word) {
    // every character goes into the set once; duplicates are ignored
    std::unordered_set<char> uniquechars(word.begin(), word.end());
    return uniquechars.size();
}
This yields an expected complexity of O(n), which is as good as it gets: you have to look at every character at least once.
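For example (a hypothetical driver, assuming the wordEntropy above is in scope):

#include <iostream>

int main()
{
    // "boggle" has five distinct letters: b, o, g, l, e
    std::cout << wordEntropy("boggle") << '\n';   // prints 5
}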
If you are allowed to modify the word, do the computation in place, without any extra (and time-consuming) memory allocations (note that std::sort makes this O(n log n) rather than O(n)):

std::sort(word.begin(), word.end());                   // group equal characters together
auto last = std::unique(word.begin(), word.end());     // collapse runs of adjacent duplicates
return last - word.begin();                            // distance = number of unique characters
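Wrapped into a self-contained function (a sketch; the name is hypothetical, and taking the string by value leaves the caller's copy untouched while std::sort reorders ours):

#include <algorithm>
#include <string>

int wordEntropySorted(std::string word)   // by value: we are going to reorder it
{
    std::sort(word.begin(), word.end());                 // O(n log n)
    auto last = std::unique(word.begin(), word.end());   // drop adjacent duplicates
    return static_cast<int>(last - word.begin());
}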
If this is really about performance, depending on the range of valid characters something like this may be faster:
#include <algorithm>
#include <cstddef>
#include <string>

std::size_t wordEntropy( const std::string & word )
{
    unsigned char seen[256] = { 0 };      // one counter per possible byte value
    for( unsigned char c : word )
    {
        ++seen[ c ];
    }
    return std::count_if( &seen[0], &seen[0] + 256,
                          []( unsigned char c ) { return c != 0; } );
}
But obviously, this is a little harder to maintain. This solution has a guaranteed complexity of O(n) and it does not make any dynamic memory allocations.
An alternative version that does not have problems if a character occurs more than 255 times (the unsigned char counters above wrap around at 256):
std::size_t wordEntropy( const std::string & word )
{
    bool seen[256] = { false };           // one flag per possible byte value
    for( unsigned char c : word )
    {
        seen[ c ] = true;
    }
    return std::count_if( &seen[0], &seen[0] + 256,
                          []( bool t ) { return t; } );
}