Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

number to unique permutation mapping of a sequence containing duplicates

I am looking for an algorithm that can map a number to a unique permutation of a sequence. I have found out about Lehmer codes and the factorial number system thanks to a similar question, Fast permutation -> number -> permutation mapping algorithms, but that question doesn't deal with the case where there are duplicate elements in the sequence.

For example, take the sequence 'AAABBC'. There are 6! = 720 ways that could be arranged, but I believe there are only 6! / (3! * 2! * 1!) = 60 unique permutation of this sequence. How can I map a number to a permutation in these cases?

Edit: changed the term 'set' to 'sequence'.

like image 821
frnknstn Avatar asked Jan 08 '13 09:01

frnknstn


People also ask

How do you calculate permutations with duplicates?

There is a subset of permutations that takes into account that there are double objects or repetitions in a permutation problem. In general, repetitions are taken care of by dividing the permutation by the factorial of the number of objects that are identical.


2 Answers

From Permutation to Number:

Let K be the number of character classes (example: AAABBC has three character classes)

Let N[K] be the number of elements in each character class. (example: for AAABBC, we have N[K]=[3,2,1], and let N= sum(N[K])

Every legal permutation of the sequence then uniquely corresponds to a path in an incomplete K-way tree.

The unique number of the permutation then corresponds to the index of the tree-node in a post-order traversal of the K-ary tree terminal nodes.

Luckily, we don't actually have to perform the tree traversal -- we just need to know how many terminal nodes in the tree are lexicographically less than our node. This is very easy to compute, as at any node in the tree, the number terminal nodes below the current node is equal to the number of permutations using the unused elements in the sequence, which has a closed form solution that is a simple multiplication of factorials.

So given our 6 original letters, and the first element of our permutation is a 'B', we determine that there will be 5!/3!1!1! = 20 elements that started with 'A', so our permutation number has to be greater than 20. Had our first letter been a 'C', we could have calculated it as 5!/2!2!1! (not A) + 5!/3!1!1! (not B) = 30+ 20, or alternatively as 60 (total) - 5!/3!2!0! (C) = 50

Using this, we can take a permutation (e.g. 'BAABCA') and perform the following computations: Permuation #= (5!/2!2!1!) ('B') + 0('A') + 0('A')+ 3!/1!1!1! ('B') + 2!/1!

= 30 + 3 +2 = 35

Checking that this works: CBBAAA corresponds to

(5!/2!2!1! (not A) + 5!/3!1!1! (not B)) 'C'+ 4!/2!2!0! (not A) 'B' + 3!/2!1!0! (not A) 'B' = (30 + 20) +6 + 3 = 59

Likewise, AAABBC = 0 ('A') + 0 'A' + '0' A' + 0 'B' + 0 'B' + 0 'C = 0

Sample implementation:

import math
import copy
from operator import mul

def computePermutationNumber(inPerm, inCharClasses):
    permutation=copy.copy(inPerm)
    charClasses=copy.copy(inCharClasses)

    n=len(permutation)
    permNumber=0
    for i,x in enumerate(permutation):
        for j in xrange(x):
            if( charClasses[j]>0):
                charClasses[j]-=1
                permNumber+=multiFactorial(n-i-1, charClasses)
                charClasses[j]+=1
        if charClasses[x]>0:
            charClasses[x]-=1
    return permNumber

def multiFactorial(n, charClasses):
    val= math.factorial(n)/ reduce(mul, (map(lambda x: math.factorial(x), charClasses)))
    return val

From Number to Permutation: This process can be done in reverse, though I'm not sure how efficiently: Given a permutation number, and the alphabet that it was generated from, recursively subtract the largest number of nodes less than or equal to the remaining permutation number.

E.g. Given a permutation number of 59, we first can subtract 30 + 20 = 50 ('C') leaving 9. Then we can subtract 'B' (6) and a second 'B'(3), re-generating our original permutation.

like image 117
gbronner Avatar answered Oct 06 '22 01:10

gbronner


Here is an algorithm in Java that enumerates the possible sequences by mapping an integer to the sequence.

public class Main {

    private int[] counts = { 3, 2, 1 }; // 3 Symbols A, 2 Symbols B, 1 Symbol C
    private int n = sum(counts);

    public static void main(String[] args) {
        new Main().enumerate();
    }

    private void enumerate() {
        int s = size(counts);
        for (int i = 0; i < s; ++i) {
            String p = perm(i);
            System.out.printf("%4d -> %s\n", i, p);
        }

    }

    // calculates the total number of symbols still to be placed
    private int sum(int[] counts) {
        int n = 0;
        for (int i = 0; i < counts.length; i++) {
            n += counts[i];
        }
        return n;
    }

    // calculates the number of different sequences with the symbol configuration in counts
    private int size(int[] counts) {
        int res = 1;
        int num = 0;
        for (int pos = 0; pos < counts.length; pos++) {
            for (int den = 1; den <= counts[pos]; den++) {
                res *= ++num;
                res /= den;
            }
        }
        return res;
    }

    // maps the sequence number to a sequence
    private String perm(int num) {
        int[] counts = this.counts.clone();
        StringBuilder sb = new StringBuilder(n);
        for (int i = 0; i < n; ++i) {
            int p = 0;
            for (;;) {
                while (counts[p] == 0) {
                    p++;
                }
                counts[p]--;
                int c = size(counts);
                if (c > num) {
                    sb.append((char) ('A' + p));
                    break;
                }
                counts[p]++;
                num -= c;
                p++;
            }
        }
        return sb.toString();
    }

}

The mapping used by the algorithm is as follows. I use the example given in the question (3 x A, 2 x B, 1 x C) to illustrate it.

There are 60 (=6!/3!/2!/1!) possible sequences in total, 30 (=5!/2!/2!/1!) of them have an A at the first place, 20 (=5!/3!/1!/1!) have a B at the first place, and 10 (=5!/3!/2!/0!) have a C at the first place.

The numbers 0..29 are mapped to all sequences starting with an A, 30..49 are mapped to the sequences starting with B, and 50..59 are mapped to the sequences starting with C.

The same process is repeated for the next place in the sequence, for example if we take the sequences starting with B we have now to map numbers 0 (=30-30) .. 19 (=49-30) to the sequences with configuration (3 x A, 1 x B, 1 x C)

like image 31
Henry Avatar answered Oct 06 '22 01:10

Henry