Suppose a = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6] and s = [3, 3, 9, 3, 6, 3]. I'm looking for the best way to repeat a[i] exactly s[i] times and then have a flattened array in the form of b = [0.1, 0.1, 0.1, 0.2, 0.2, 0.2, 0.3, 0.3, 0.3, 0.3, ... ].
I want to do this as fast as possible since I have to do it many times. I'm using Python and numpy, and the arrays are defined as numpy.ndarray. I searched around and found out about repeat, tile and column_stack, which can be used nicely to repeat each element n times, but I want to repeat each element a different number of times.
One way to do this is:
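import numpy as np
parts = np.hsplit(a, 6)  # split a into six one-element arrays
for i in range(len(parts)):
    parts[i] = np.repeat(parts[i], s[i])  # repeat each piece s[i] times
a = np.concatenate(parts)  # join the pieces into one flat array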
I am wondering if there is a better way to do it.
The repeat() function repeats the elements of an array. It takes the input array and the number of repetitions for each element; repeats is broadcast to fit the shape of the given axis.
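As a minimal sketch of the three ways the repeats argument can be used (the variable names twice, varied, m and rows are illustrative only, not from the original post):

```python
import numpy as np

a = np.array([0.1, 0.2, 0.3])

# A scalar count is broadcast: every element is repeated twice.
twice = np.repeat(a, 2)

# A sequence of counts gives each element its own repeat count.
varied = np.repeat(a, [1, 2, 3])

# With axis=0 on a 2-D array, the counts apply to whole rows.
m = np.array([[1, 2], [3, 4]])
rows = np.repeat(m, [2, 1], axis=0)

print(twice)
print(varied)
print(rows)
```

When no axis is given, np.repeat flattens the input first, which is why the 1-D results above come back as flat arrays.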
The np.repeat function repeats the individual elements of an input array, whereas np.tile takes the entire array, including the order of its elements, and copies it in a particular direction.
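A small side-by-side comparison may make the difference concrete (the array x is an illustrative input, not from the original post):

```python
import numpy as np

x = np.array([1, 2, 3])

# np.repeat duplicates each element in place.
repeated = np.repeat(x, 2)  # [1, 1, 2, 2, 3, 3]

# np.tile copies the whole array end to end.
tiled = np.tile(x, 2)       # [1, 2, 3, 1, 2, 3]

print(repeated)
print(tiled)
```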
To stack masked arrays in sequence depth-wise (along the third axis), use the ma.dstack() method in NumPy. This is equivalent to concatenation along the third axis after 2-D arrays of shape (M,N) have been reshaped to (M,N,1) and 1-D arrays of shape (N,) have been reshaped to (1,N,1).
That's exactly what numpy.repeat does:
>>> a = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6])
>>> s = np.array([3, 3, 9, 3, 6, 3])
>>> np.repeat(a, s)
array([ 0.1, 0.1, 0.1, 0.2, 0.2, 0.2, 0.3, 0.3, 0.3, 0.3, 0.3,
0.3, 0.3, 0.3, 0.3, 0.4, 0.4, 0.4, 0.5, 0.5, 0.5, 0.5,
0.5, 0.5, 0.6, 0.6, 0.6])
In pure Python you can do something like:
>>> from itertools import repeat, chain, imap
>>> list(chain.from_iterable(imap(repeat, a, s)))
[0.1, 0.1, 0.1, 0.2, 0.2, 0.2, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.4, 0.4, 0.4, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.6, 0.6, 0.6]
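Note that itertools.imap exists only in Python 2. On Python 3 the built-in map is already lazy, so a roughly equivalent sketch (assuming the same a and s as in the question) would be:

```python
from itertools import chain, repeat

a = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
s = [3, 3, 9, 3, 6, 3]

# map(repeat, a, s) yields repeat(a[i], s[i]) for each pair;
# chain.from_iterable flattens those iterators into one sequence.
b = list(chain.from_iterable(map(repeat, a, s)))
print(b)
```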
But of course it is going to be way slower than its NumPy equivalent:
>>> s = [3, 3, 9, 3, 6, 3]*1000
>>> a = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]*1000
>>> %timeit list(chain.from_iterable(imap(repeat, a, s)))
1000 loops, best of 3: 1.21 ms per loop
>>> %timeit np.repeat(a_a, s_a)  # a_a and s_a are NumPy arrays of the same size as a and s
10000 loops, best of 3: 202 µs per loop
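The %timeit magic is only available in IPython. To reproduce the comparison in a plain script, something like the following sketch could be used; it assumes Python 3 (so map replaces imap), and the helper names pure_python and with_numpy are made up for illustration. Absolute timings will vary by machine and NumPy version.

```python
import timeit
from itertools import chain, repeat

import numpy as np

a = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6] * 1000
s = [3, 3, 9, 3, 6, 3] * 1000
a_a = np.array(a)  # NumPy versions of the same data
s_a = np.array(s)


def pure_python():
    # itertools-based approach on plain lists
    return list(chain.from_iterable(map(repeat, a, s)))


def with_numpy():
    # vectorised approach on ndarrays
    return np.repeat(a_a, s_a)


# Both approaches should produce the same values.
assert np.array_equal(np.array(pure_python()), with_numpy())

t_py = timeit.timeit(pure_python, number=100)
t_np = timeit.timeit(with_numpy, number=100)
print(f"pure Python: {t_py / 100 * 1e6:.1f} µs per call")
print(f"np.repeat:   {t_np / 100 * 1e6:.1f} µs per call")
```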