I have been learning suffix arrays creation, & i understand that We first sort all suffixes according to first character, then according to first 2 characters, then first 4 characters and so on while the number of characters to be considered is smaller than 2n.
But my doubt is why don't we choose the first 3 characters, then 9... and so on. Why only 2 characters are taken into account since the strings are a part of same strings and not different random strings?
I haven't analyzed the suffix array construction algorithm thoroughly, but still would like to share my thoughts.
In my humble opinion, your question is similar to the following ones:
Why do computers use binary encoding of information instead of ternary?
Why does binary search bisect the range instead of trisecting it?
Why are there two sexes rather than three?
The reason is that the number 2 is special - it is the smallest plural number. The difference between 1 and 2 is qualitative, whereas the difference between 2 and 3 (as well as any other positive integer) is quantitative and therefore not as drastic.
As a result, binary formulation of many algorithms and data structures turns out to be the simplest one, though some of them may be generalized, with various degrees of added complexity, for an arbitrary base.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With