Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Calculating Nth permutation step?

I have a char[26] of the letters a-z and via nested for statements I'm producing a list of sequences like:

aaa, aaz... aba, abb, abz, ... zzy, zzz.

Currently, the software is written to generate the list of all possible values from aaa-zzz and then maintains an index, and goes through each of them performing an operation on them.

The list is obviously large, it's not ridiculously large, but it's gotten to the point where the memory footprint is too large (there are also other areas being looked at, but this is one that I've got).

I'm trying to produce a formula where I can keep the index, but do away with the list of sequences and calculate the current sequence based on the current index (as the time between operations between sequences is long).

Eg:

char[] characters = {a, b, c... z};
int currentIndex = 29; // abd

public string CurrentSequence(int currentIndex)
{
    int ndx1 = getIndex1(currentIndex); // = 0
    int ndx2 = getIndex2(currentIndex); // = 1
    int ndx3 = getIndex3(currentIndex); // = 3

    return string.Format(
        "{0}{1}{2}", 
        characters[ndx1], 
        characters[ndx2], 
        characters[ndx3]); // abd
}

I've tried working out a small example using a subset (abc) and trying to index into that using modulo division, but I'm having trouble thinking today and I'm stumped.

I'm not asking for an answer, just any kind of help. Maybe a kick in the right direction?

like image 588
Steven Evers Avatar asked Aug 24 '10 21:08

Steven Evers


3 Answers

Hint: Think of how you would print a number in base 26 instead of base 10, and with letters instead of digits. What's the general algorithm for displaying a number in an arbitrary base?

Spoiler: (scroll right to view)

                                                                                      int ndx1 = currentIndex / 26 / 26 % 26;
                                                                                      int ndx2 = currentIndex / 26 % 26;
                                                                                      int ndx3 = currentIndex % 26;
like image 181
John Kugelman Avatar answered Nov 10 '22 12:11

John Kugelman


Something like this ought to work, assuming 26 characters:

public string CurrentSequence(int currentIndex) {
    return characters[currentIndex / (26 * 26)] 
        + characters[(currentIndex / 26) % 26]
        + characters[currentIndex % 26];
}
like image 38
recursive Avatar answered Nov 10 '22 12:11

recursive


Wow, two questions in one day that can be solved via Cartesian products. Amazing.

You can use Eric Lippert's LINQ snippet to generate all combinations of the index values. This approach results in a streaming set of values, so they don't require storage in memory. This approach nicely separates the logic of generating the codes from maintaining state or performing computation with the code.

Eric's code for all combinations:

static IEnumerable<IEnumerable<T>> CartesianProduct<T>(this IEnumerable<IEnumerable<T>> sequences)  
{  
  IEnumerable<IEnumerable<T>> emptyProduct = new[] { Enumerable.Empty<T>() };  
  return sequences.Aggregate(  
    emptyProduct,  
    (accumulator, sequence) =>   
      from accseq in accumulator   
      from item in sequence   
      select accseq.Concat(new[] {item}));                 
} 

You can now write:

public static IEnumerable<string> AllCodes()
{
  char[] characters = {a, b, c... z}; 
  IEnumerable<char[]> codeSets = new[] { characters, characters, characters };

  foreach( var codeValues in codeSets.CartesianProduct() )
  {
    yield return 
       string.Format( "{0}{1}{2}", codeValues[0], codeValues[1], codeValues[2]);
  }
}

The code above generates a streaming sequence of all code strings from aaa to zzz. You can now use this elsewhere where you perform your processing:

foreach( var code in AllCodes() )
{
    // use the code value somehow...
}
like image 31
LBushkin Avatar answered Nov 10 '22 14:11

LBushkin