I would like to implement itertools.combinations for numpy. Based on this discussion, I have a function that works for 1D input:
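    from itertools import combinations
    from numpy import asarray, dtype, fromiter

    def combs(a, r):
        """
        Return successive r-length combinations of elements in the array a.
        Should produce the same output as array(list(combinations(a, r))), but
        faster.
        """
        a = asarray(a)
        # Structured dtype with r unnamed fields, one per combination element
        dt = dtype([('', a.dtype)]*r)
        b = fromiter(combinations(a, r), dt)
        # View the structured records as a plain (n_combinations, r) array
        return b.view(a.dtype).reshape(-1, r)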
and the output makes sense:
    In [1]: list(combinations([1,2,3], 2))
    Out[1]: [(1, 2), (1, 3), (2, 3)]

    In [2]: array(list(combinations([1,2,3], 2)))
    Out[2]:
    array([[1, 2],
           [1, 3],
           [2, 3]])

    In [3]: combs([1,2,3], 2)
    Out[3]:
    array([[1, 2],
           [1, 3],
           [2, 3]])
However, it would be best if I could expand it to N-D inputs, where the additional dimensions simply let you do multiple calls at once, quickly. So, conceptually, if combs([1, 2, 3], 2) produces [1, 2], [1, 3], [2, 3], and combs([4, 5, 6], 2) produces [4, 5], [4, 6], [5, 6], then combs((1,2,3) and (4,5,6), 2) should produce [1, 2], [1, 3], [2, 3] and [4, 5], [4, 6], [5, 6], where "and" just represents parallel rows or columns (whichever makes sense), and likewise for additional dimensions.
I'm not sure:
- How the dimensions should work. (Many NumPy functions take an axis= parameter, with a default of axis 0. So probably axis 0 should be the one I am combining along, and all other axes just represent parallel calculations?)
- How to get the code above to work with N-D input. (Right now I get ValueError: setting an array element with a sequence.)
- Whether there is a better way to do dt = dtype([('', a.dtype)]*r)?

To create combinations without using itertools, fix the first element of the list and combine it with every combination of the remaining elements. Then repeat this for each element of the list in turn, recursing on the remaining list each time.
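The recursive approach described above can be sketched as follows. This is a minimal illustration, not code from the original posts; the name combinations_recursive is hypothetical, and it assumes the input is an indexable sequence such as a list.

```python
def combinations_recursive(items, r):
    """Return all r-length combinations of items as a list of lists."""
    if r == 0:
        # Exactly one way to choose nothing: the empty combination.
        return [[]]
    result = []
    # Fix each candidate first element, leaving at least r - 1 items after it.
    for i in range(len(items) - r + 1):
        head = items[i]
        # Combine the fixed element with combinations of the remaining items.
        for tail in combinations_recursive(items[i + 1:], r - 1):
            result.append([head] + tail)
    return result


pairs = combinations_recursive([1, 2, 3], 2)
print(pairs)  # [[1, 2], [1, 3], [2, 3]]
```

The results come out in the same order as itertools.combinations, but as lists rather than tuples. For large inputs the itertools version will generally be faster.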
Get NumPy Array Combinations With the itertools.product() Function in Python. The itertools package provides many functions related to combination and permutation. We can use the itertools.
itertools.combinations(iterable, r): Return r-length subsequences of elements from the input iterable. The combination tuples are emitted in lexicographic ordering according to the order of the input iterable. So, if the input iterable is sorted, the combination tuples will be produced in sorted order.
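A short illustration of that ordering rule, using only the standard library. The variable names are made up for this example.

```python
from itertools import combinations

# Ordering follows input position, not element value.
unsorted_result = list(combinations([3, 1, 2], 2))
print(unsorted_result)  # [(3, 1), (3, 2), (1, 2)]

# Sorting the input first yields combinations in sorted order.
sorted_result = list(combinations(sorted([3, 1, 2]), 2))
print(sorted_result)  # [(1, 2), (1, 3), (2, 3)]
```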
Another approach would be to generate combinations at random, ensuring there are no duplicates. This uses the random_combination function from answers to this question, which in turn comes from the itertools documentation. The code generates 100,000 unique 4x4 samples in about ten seconds, at least on my machine.
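The code that paragraph refers to is not shown here, so the following is only a sketch of the general idea. random_combination follows the recipe in the itertools documentation; unique_random_combinations is a hypothetical helper added for illustration, and it deduplicates on index tuples so that repeated values in the input cannot stall the loop.

```python
import math
import random


def random_combination(iterable, r):
    """Random selection from itertools.combinations(iterable, r)."""
    pool = tuple(iterable)
    n = len(pool)
    indices = sorted(random.sample(range(n), r))
    return tuple(pool[i] for i in indices)


def unique_random_combinations(iterable, r, count):
    """Draw `count` distinct r-length combinations (hypothetical helper)."""
    pool = tuple(iterable)
    n = len(pool)
    if count > math.comb(n, r):
        raise ValueError("count exceeds the number of possible combinations")
    seen = set()
    # Rejection sampling: keep drawing index tuples until enough are distinct.
    while len(seen) < count:
        seen.add(random_combination(range(n), r))
    return [tuple(pool[i] for i in idx) for idx in seen]


samples = unique_random_combinations(range(10), 4, 50)
print(len(samples))  # 50
```

Rejection sampling like this slows down as count approaches the total number of combinations, so it suits cases where only a small fraction is needed.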
You can use itertools.combinations() to create the index array, and then use NumPy's fancy indexing:
    import numpy as np
    from itertools import combinations, chain
    from scipy.special import comb

    def comb_index(n, k):
        # Exact number of k-combinations of n items
        count = comb(n, k, exact=True)
        # Flatten all index tuples into one 1-D array, then reshape to (count, k)
        index = np.fromiter(chain.from_iterable(combinations(range(n), k)),
                            int, count=count*k)
        return index.reshape(-1, k)

    data = np.array([[1,2,3,4,5],[10,11,12,13,14]])

    idx = comb_index(5, 3)
    print(data[:, idx])
output:
    [[[ 1  2  3]
      [ 1  2  4]
      [ 1  2  5]
      [ 1  3  4]
      [ 1  3  5]
      [ 1  4  5]
      [ 2  3  4]
      [ 2  3  5]
      [ 2  4  5]
      [ 3  4  5]]

     [[10 11 12]
      [10 11 13]
      [10 11 14]
      [10 12 13]
      [10 12 14]
      [10 13 14]
      [11 12 13]
      [11 12 14]
      [11 13 14]
      [12 13 14]]]
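One possible way to wrap the index-array idea in a function with an axis= parameter, as the question asks, is sketched below. The name combs_nd and its signature are assumptions made for illustration, not an existing NumPy function. It builds the index array directly from itertools.combinations (avoiding the SciPy dependency) and assumes r >= 1.

```python
import numpy as np
from itertools import combinations


def combs_nd(a, r, axis=0):
    """Combine r elements along `axis`; other axes are treated as parallel inputs."""
    a = np.asarray(a)
    n = a.shape[axis]
    # Index array of shape (n_combinations, r); reshape keeps it 2-D even when empty.
    idx = np.array(list(combinations(range(n), r)), dtype=np.intp).reshape(-1, r)
    # np.take replaces `axis` with the two dimensions of idx.
    return np.take(a, idx, axis=axis)


single = combs_nd([1, 2, 3], 2)
print(single.tolist())  # [[1, 2], [1, 3], [2, 3]]

# Each row is an independent input, so combine along axis 1.
stacked = combs_nd([[1, 2, 3], [4, 5, 6]], 2, axis=1)
print(stacked.shape)  # (2, 3, 2)
```

With this layout the output shape is a.shape[:axis] + (n_combinations, r) + a.shape[axis+1:], so stacked[1] holds the combinations of [4, 5, 6]. Whether axis 0 or the last axis is the better default is a design choice left open here.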