I want to find efficiently permutations of a vector which has tied values.
E.g., if perm_vector = [0,0,1,2]
I would want to obtain as output all combinations of [0,0,1,2], [0,0,2,1], [0,1,2,0]
and so on, but I don't want to obtain [0,0,1,2]
twice which is what the standard itertools.permutations(perm_vector)
would give.
I tried the following but it works really SLOW when perm_vector grows
in len:
vectors_list = []
for it in itertools.permutations(perm_vector):
vectors_list.append(list(it))
df_vectors_list = pd.DataFrame( vectors_list)
df_gb = df_vectors_list.groupby(list(df_vectors_list.columns))
vectors_list = pd.DataFrame(df_gb.groups.keys()).T
The question is of more general "speed-up" nature, actually. The main time is spent on creating the permutations of long vectors - even without the duplicity, creation of permutations of a vector of 12 unique values takes a "infinity". Is there a possibility to call the itertools iteratively without accessing the entire permutations data but working on bunches of it?
Try this if perm_vector is small:
import itertools as iter
{x for x in iter.permutations(perm_vector)}
This should give you unique values, because now it becomes a set, which by default delete duplications.
If perm_vector is large, you might want to try backtracking:
def permu(L, left, right, cache):
for i in range(left, right):
L[left], L[i] = L[i], L[left]
L_tuple = tuple(L)
if L_tuple not in cache:
permu(L, left + 1, right, cache)
L[left], L[i] = L[i], L[left]
cache[L_tuple] = 0
cache = {}
permu(perm_vector, 0, len(perm_vector), cache)
cache.keys()
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With