I have:
test = np.random.randn(40,40,3)
And I want to make:
result = Repeat(test, 10)
So that result contains the array test repeated 10 times, with shape:
(10, 40, 40, 3)
So I want to create a tensor with a new leading axis that holds 10 copies of test. I also want to do this as efficiently as possible. How can I do this with NumPy?
The np.repeat(a, repeats, axis=None) function repeats the elements of an array. Here a is the input array and repeats is the number of repetitions for each element; repeats is broadcast to fit the shape of the given axis.
To repeat the elements of a NumPy array multiple times, use numpy.repeat(). It returns a new array with the items repeated along the given axis (for example 0 or 1); if no axis is given, the input is flattened first.
The NumPy repeat function essentially repeats the numbers inside of an array. It repeats the individual elements of an array. Having said that, the behavior of NumPy repeat is a little hard to understand sometimes.
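A small sketch of the behaviour described above may help. The tiny array a is made up for illustration; it shows np.repeat without an axis, with axis=0, and with a length-1 leading axis added first.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

# No axis given: the input is flattened, then each element is repeated.
flat = np.repeat(a, 2)

# axis=0: each row is repeated in place, so the rows are interleaved.
rows = np.repeat(a, 2, axis=0)

# Adding a length-1 leading axis first makes "repeat each element along
# axis 0" mean "repeat the whole array".
whole = np.repeat(a[np.newaxis, ...], 2, axis=0)

print(flat)         # [1 1 2 2 3 3 4 4]
print(rows.shape)   # (4, 2)
print(whole.shape)  # (2, 2, 2)
```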
One can use np.repeat together with np.newaxis:
import numpy as np
test = np.random.randn(40,40,3)
result = np.repeat(test[np.newaxis,...], 10, axis=0)
print(result.shape)
>> (10, 40, 40, 3)
Assuming you're looking to copy the values 10 times, you can just stack 10 copies of the array:
def repeat(arr, count):
    return np.stack([arr for _ in range(count)], axis=0)
axis=0 is actually the default, so it's not really necessary here, but I think it makes it clearer that you're adding the new axis on the front.
In fact, this is pretty much identical to what the examples for stack are doing:
>>> arrays = [np.random.randn(3, 4) for _ in range(10)]
>>> np.stack(arrays, axis=0).shape
(10, 3, 4)
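To make the role of the axis argument concrete, here is a minimal sketch using a small made-up array arr. It compares the default placement of the new axis with placing it last.

```python
import numpy as np

arr = np.arange(6).reshape(2, 3)

# axis=0 is the default: the new axis goes in front.
front = np.stack([arr] * 4)

# axis=-1 puts the new axis last instead.
back = np.stack([arr] * 4, axis=-1)

print(front.shape)  # (4, 2, 3)
print(back.shape)   # (2, 3, 4)
```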
At first glance you might think repeat or tile would be a better fit.
But repeat is about repeating over an existing axis (or flattening the array), so you'd need to reshape either before or after. (Which is just as efficient, but I think not as simple.)
And tile (assuming you use an array-like reps; with scalar reps it's basically like repeat) is about filling out a multidimensional spec in all directions, which is much more complex than what you want for this simple case.
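For comparison, this is roughly what the repeat and tile variants look like next to stack. It is a sketch, not a recommendation; the reps tuple (10, 1, 1, 1) is one way to ask tile for 10 copies along a new leading axis and one copy along each existing axis.

```python
import numpy as np

test = np.random.randn(40, 40, 3)

via_stack = np.stack([test] * 10, axis=0)

# repeat needs an axis to repeat over, so add a length-1 axis first.
via_repeat = np.repeat(test[np.newaxis, ...], 10, axis=0)

# tile with a 4-element reps promotes test to 4-D, then tiles each axis.
via_tile = np.tile(test, (10, 1, 1, 1))

print(via_stack.shape, via_repeat.shape, via_tile.shape)
```

All three produce an independent (10, 40, 40, 3) array with identical contents.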
All of these options will be equally efficient. They all copy the data 10 times over, which is the expensive part; the cost of any internal processing, building tiny intermediate objects, etc. is irrelevant. The only way to make it faster is to avoid copying. Which you probably don't want to do.
But if you do… To share row storage across the 10 copies, you probably want broadcast_to:
def repeat(arr, count):
    return np.broadcast_to(arr, (count,) + arr.shape)
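One way to check what broadcast_to returns is to inspect the result directly. This sketch uses the same shapes as the question; the behaviour shown is what current NumPy does in practice.

```python
import numpy as np

arr = np.random.randn(40, 40, 3)
view = np.broadcast_to(arr, (10,) + arr.shape)

print(view.shape)                   # (10, 40, 40, 3)
print(view.strides[0])              # 0: moving along the new axis stays put in memory
print(view.flags.writeable)         # False
print(np.shares_memory(view, arr))  # True

# Because storage is shared, a change to arr shows up in every "copy".
arr[0, 0, 0] = 123.0
print(view[7, 0, 0, 0])             # 123.0
```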
Notice that broadcast_to doesn't actually guarantee that it avoids copying, just that it returns some kind of readonly view where "more than one element of a broadcasted array may refer to a single memory location". In practice, it's going to avoid copying. If you actually need that to be guaranteed for some reason (or if you want a writable view, which is usually going to be a terrible idea, but maybe you have a good reason…), you have to drop down to as_strided:
def repeat(arr, count):
    shape = (count,) + arr.shape
    strides = (0,) + arr.strides
    return np.lib.stride_tricks.as_strided(
        arr, shape=shape, strides=strides, writeable=False)
Notice that half the docs for as_strided are warning that you probably shouldn't use it, and the other half are warning that you definitely shouldn't use it for writable views, so… make sure this is what you want before doing it.
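As a quick sanity check, the read-only strided view can be compared with a real stacked copy. This is a sketch on a small made-up array; the names view, stacked and own are just for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

arr = np.arange(12, dtype=np.float64).reshape(2, 2, 3)

# Stride 0 on the new leading axis: every "copy" points at the same data.
view = as_strided(arr, shape=(4,) + arr.shape,
                  strides=(0,) + arr.strides, writeable=False)
stacked = np.stack([arr] * 4)

print(np.array_equal(view, stacked))  # True: same values as a real copy
print(np.shares_memory(view, arr))    # True: but no data was duplicated

# Writing through the read-only view is rejected.
try:
    view[0, 0, 0, 0] = 1.0
    write_failed = False
except ValueError:
    write_failed = True
print(write_failed)                   # True

# If you later need independent, writable data, copy explicitly.
own = view.copy()
own[0] = 0.0
```

After the explicit copy, modifying own leaves arr untouched.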
Of the many ways of creating a proper copy, preallocation + broadcasting seems fastest.
import numpy as np
from timeit import timeit

# Preallocate, then let broadcasting assignment fill every slice.
def f_pp_0():
    out = np.empty((10, *a.shape), a.dtype)
    out[...] = a
    return out

# Preallocate, then copy with np.copyto (which also broadcasts).
def f_pp_1():
    out = np.empty((10, *a.shape), a.dtype)
    np.copyto(out, a)
    return out

def f_oddn():
    return np.repeat(a[np.newaxis, ...], 10, axis=0)

def f_abar():
    return np.stack([a for _ in range(10)], axis=0)

def f_arry():
    return np.array(10 * [a])

a = np.random.random((40, 40, 3))

# Time every function whose name starts with 'f_'.
# Total seconds for 100000 calls divided by 100 gives ms per call.
for f in list(locals().values()):
    if callable(f) and f.__name__.startswith('f_'):
        print(f.__name__, timeit(f, number=100000) / 100, 'ms')
Sample run:
f_pp_0 0.019641224660445003 ms
f_pp_1 0.019557840081397444 ms
f_oddn 0.01983011547010392 ms
f_abar 0.03257150553865358 ms
f_arry 0.02305851033888757 ms
But the differences are small; for example, repeat is hardly slower, if at all.
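Before comparing timings it is worth confirming that the candidates really produce the same array. The following is a minimal sketch of such a check; the prealloc helper and the candidates dict are names made up here, not part of the benchmark above.

```python
import numpy as np

a = np.random.random((40, 40, 3))

def prealloc(a, n=10):
    # Allocate the output once, then let broadcasting fill every slice.
    out = np.empty((n, *a.shape), a.dtype)
    out[...] = a
    return out

candidates = {
    "prealloc": prealloc(a),
    "repeat": np.repeat(a[np.newaxis, ...], 10, axis=0),
    "stack": np.stack([a for _ in range(10)], axis=0),
    "array": np.array(10 * [a]),
}

reference = candidates["prealloc"]
for name, result in candidates.items():
    # Each result should be a separate (10, 40, 40, 3) copy with equal values.
    print(name, result.shape, np.array_equal(result, reference))
```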