Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

lexicographically smallest string after rotation

I am trying to solve this problem in spoj

I need to find the number of rotations of a given string that will make it lexicographically smallest among all the rotations.

For example:

Original: ama

First rotation: maa

Second rotation: aam This is the lexicographically smallest rotation so the answer is 2.

Here's my code:

string s,tmp;
    char ss[100002];
    scanf("%s",ss);
    s=ss;
    tmp=s;
    int i,len=s.size(),ans=0,t=0;
    for(i=0;i<len;i++)
    {
        string x=s.substr(i,len-i)+s.substr(0,i);
        if(x<tmp)
        {
            tmp=x;
            t=ans;
        }
        ans++;
    }

    cout<<t<<endl;

I am getting "Time Limit Exceeded" for this solution. I don't understand what optimizations can be made. How can I increase the speed of my solution?

like image 556
doctore Avatar asked Feb 21 '13 12:02

doctore


People also ask

How do you find the lexicographically smallest character?

Approach: In order to get the lexicographically smallest string, we need to take the minimum character from the first K characters every time we choose a character from str. To do that, we can put the first K characters in a priority_queue (min-heap) and then choose the smallest character and append it to X.

What is the smallest in lexicographical order?

The lexicographically smallest string is “abcde”.

What is lexicographically minimum fashionable string?

The lexicographically minimal string rotation (or lexicographically least circular substring) is the problem of finding a string's rotation possessing the lowest lexicographical order among all possible rotations. For example, the lexicographically minimal rotation of bbaaccaadd is aaccaaddbb .


2 Answers

You can use a modified suffix array. I mean modified because you must not stop on word end.

Here is the code for a similar problem I solved (SA is the suffix array):

//719
//Glass Beads
//Misc;String Matching;Suffix Array;Circular
#include <iostream>
#include <iomanip>
#include <cstring>
#include <string>
#include <cmath>
#define MAX 10050
using namespace std;

int RA[MAX], tempRA[MAX];
int SA[MAX], tempSA[MAX];
int C[MAX];                

void suffix_sort(int n, int k) {
    memset(C, 0, sizeof C);        

    for (int i = 0; i < n; i++)        
        C[RA[(i + k)%n]]++;

    int sum = 0;
    for (int i = 0; i < max(256, n); i++) {                     
        int t = C[i]; 
        C[i] = sum; 
        sum += t;
    }

    for (int i = 0; i < n; i++)        
        tempSA[C[RA[(SA[i] + k)%n]]++] = SA[i];

    memcpy(SA, tempSA, n*sizeof(int));
}

void suffix_array(string &s) {             
    int n = s.size();

    for (int i = 0; i < n; i++) 
        RA[i] = s[i];              

    for (int i = 0; i < n; i++) 
        SA[i] = i;

    for (int k = 1; k < n; k *= 2) {     
        suffix_sort(n, k);
        suffix_sort(n, 0);

        int r = tempRA[SA[0]] = 0;
        for (int i = 1; i < n; i++) {
            int s1 = SA[i], s2 = SA[i-1];
            bool equal = true;
            equal &= RA[s1] == RA[s2];
            equal &= RA[(s1+k)%n] == RA[(s2+k)%n];

            tempRA[SA[i]] = equal ? r : ++r;     
        }

        memcpy(RA, tempRA, n*sizeof(int));
    } 
}

int main() {
    int tt; cin >> tt;
    while(tt--) {
        string s; cin >> s;
        suffix_array(s);
        cout << SA[0]+1 << endl;
   }
}

I took this implementation mostly from this book. There is an easier to write O(n log²n) version, but may not be efficient enough for your case (n=10^5). This version is O(n log n), and it's not the most efficient algorithm. The wikipedia article lists some O(n) algorithms, but I find most of them too complex to write during a programming contest. This O(n log n) is usually enough for most problems.

You can find some slides explaining suffix array concept (from the author of the book I mentioned) here.

like image 74
Juan Lopes Avatar answered Sep 29 '22 17:09

Juan Lopes


I know this comes very late but I stumbled across this from google on my search for an even faster variant of this algorithm. Turns out a good implementation is found at github: https://gist.github.com/MaskRay/8803371

It uses the lyndon factorization. That means it repeatly splits the string into lexicographically decreasing lyndon words. Lyndon word are strings that are (one of) the minimal rotations of themselves. Doing this in a circular way yields the lms of the string as the last found lyndon word.

int lyndon_word(const char *a, int n)
{
  int i = 0, j = 1, k;
  while (j < n) {
    // Invariant: i < j and indices in [0,j) \ i cannot be the first optimum
    for (k = 0; k < n && a[(i+k)%n] == a[(j+k)%n]; k++);
    if (a[(i+k)%n] <= a[(j+k)%n]) {
      // if k < n
      //   foreach p in [j,j+k], s_p > s_{p-(j-i)}
      //   => [j,j+k] are all suboptimal
      //   => indices in [0,j+k+1) \ i are suboptimal
      // else
      //   None of [j,j+k] is the first optimum
      j += k+1;
    } else {
      // foreach p in [i,i+k], s_p > s_{p+(j-i)}
      // => [i,i+k] are all suboptimal
      // => [0,j) and [0,i+k+1) are suboptimal
      // if i+k+1 < j
      //   j < j+1 and indices in [0,j+1) \ j are suboptimal
      // else
      //   i+k+1 < i+k+2 and indices in [0,i+k+2) \ (i+k+1) are suboptimal
      i += k+1;
      if (i < j)
        i = j++;
      else
        j = i+1;
    }
  }
  // j >= n => [0,n) \ i cannot be the first optimum
  return i;
}
like image 34
Christoph Diegelmann Avatar answered Sep 29 '22 17:09

Christoph Diegelmann