Is there a better way in numpy to tile an array a non-integer number of times? This gets the job done, but is clunky and doesn't easily generalize to n-dimensions:
import numpy as np
arr = np.arange(6).reshape((2, 3))
desired_shape = (5, 8)
reps = tuple([x // y for x, y in zip(desired_shape, arr.shape)])
left = tuple([x % y for x, y in zip(desired_shape, arr.shape)])
tmp = np.tile(arr, reps)
tmp = np.r_[tmp, tmp[slice(left[0]), :]]
tmp = np.c_[tmp, tmp[:, slice(left[1])]]
This yields:
array([[0, 1, 2, 0, 1, 2, 0, 1],
       [3, 4, 5, 3, 4, 5, 3, 4],
       [0, 1, 2, 0, 1, 2, 0, 1],
       [3, 4, 5, 3, 4, 5, 3, 4],
       [0, 1, 2, 0, 1, 2, 0, 1]])
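For reference, the tile-and-trim idea above does extend to n dimensions if you tile by the ceiling of each ratio and then slice back down to the target shape. This is only a minimal sketch, not one of the benchmarked answers: `tile_to_shape` is a hypothetical helper name, and it assumes `shape` has exactly one entry per axis of `a`.

```python
import numpy as np

def tile_to_shape(a, shape):
    # Hypothetical helper: over-tile with np.tile, then trim the excess.
    a = np.asarray(a)
    # Ceiling division, so every axis is tiled at least far enough.
    reps = tuple(-(-n // m) for n, m in zip(shape, a.shape))
    tiled = np.tile(a, reps)
    # Keep only the first n entries along each axis.
    return tiled[tuple(slice(0, n) for n in shape)]

arr = np.arange(6).reshape((2, 3))
result = tile_to_shape(arr, (5, 8))
print(result)
```

The trade-off is that the tiled intermediate can be up to one extra copy of `a` larger than needed along each axis before it is trimmed.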
EDIT: Performance results
Some tests of the three answers that were generalized to n-dimensions. These definitions were put in a file newtile.py:
import numpy as np

def tile_pad(a, dims):
    return np.pad(a, tuple((0, i) for i in (np.array(dims) - a.shape)),
                  mode='wrap')

def tile_meshgrid(a, dims):
    # tuple() is needed on recent NumPy, which no longer accepts a list of
    # index arrays as a multidimensional index.
    return a[tuple(np.meshgrid(*[np.arange(j) % k for j, k in zip(dims, a.shape)],
                               sparse=True, indexing='ij'))]

def tile_rav_mult_idx(a, dims):
    return a.flat[np.ravel_multi_index(np.indices(dims), a.shape, mode='wrap')]
Here are the bash lines:
python -m timeit -s 'import numpy as np' 'import newtile' 'newtile.tile_pad(np.arange(30).reshape(2, 3, 5), (3, 5, 7))'
python -m timeit -s 'import numpy as np' 'import newtile' 'newtile.tile_meshgrid(np.arange(30).reshape(2, 3, 5), (3, 5, 7))'
python -m timeit -s 'import numpy as np' 'import newtile' 'newtile.tile_rav_mult_idx(np.arange(30).reshape(2, 3, 5), (3, 5, 7))'
python -m timeit -s 'import numpy as np' 'import newtile' 'newtile.tile_pad(np.arange(2310).reshape(2, 3, 5, 7, 11), (13, 17, 19, 23, 29))'
python -m timeit -s 'import numpy as np' 'import newtile' 'newtile.tile_meshgrid(np.arange(2310).reshape(2, 3, 5, 7, 11), (13, 17, 19, 23, 29))'
python -m timeit -s 'import numpy as np' 'import newtile' 'newtile.tile_rav_mult_idx(np.arange(2310).reshape(2, 3, 5, 7, 11), (13, 17, 19, 23, 29))'
Here are the results with small arrays (2 x 3 x 5):
pad: 10000 loops, best of 3: 106 usec per loop
meshgrid: 10000 loops, best of 3: 56.4 usec per loop
ravel_multi_index: 10000 loops, best of 3: 50.2 usec per loop
Here are the results with larger arrays (2 x 3 x 5 x 7 x 11):
pad: 10 loops, best of 3: 25.2 msec per loop
meshgrid: 10 loops, best of 3: 300 msec per loop
ravel_multi_index: 10 loops, best of 3: 218 msec per loop
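So the method using np.pad is probably the most performant choice for large arrays (roughly 9 to 12 times faster in this test), even though it is about twice as slow as the two index-based methods for small ones.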
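Before reading too much into timings like these, it may be worth confirming that the candidates really return the same array. The sketch below restates the three functions with small adjustments (a `tuple()` around the index arrays, and `a.ravel()[...]` for the flat lookup) and compares them against a hypothetical `tile_reference` helper built on `np.tile`; that helper and the variable names are assumptions, not part of the original test.

```python
import numpy as np

def tile_pad(a, dims):
    pads = tuple((0, int(i)) for i in (np.array(dims) - a.shape))
    return np.pad(a, pads, mode='wrap')

def tile_meshgrid(a, dims):
    idx = np.meshgrid(*[np.arange(j) % k for j, k in zip(dims, a.shape)],
                      sparse=True, indexing='ij')
    return a[tuple(idx)]  # tuple() so NumPy treats this as a multi-index

def tile_rav_mult_idx(a, dims):
    # Flat positions of the wrapped n-d indices, then a lookup on flat data.
    flat_idx = np.ravel_multi_index(tuple(np.indices(dims)), a.shape, mode='wrap')
    return a.ravel()[flat_idx]

def tile_reference(a, dims):
    # Hypothetical reference: over-tile with np.tile, then trim to dims.
    reps = tuple(-(-n // m) for n, m in zip(dims, a.shape))
    return np.tile(a, reps)[tuple(slice(0, n) for n in dims)]

a = np.arange(30).reshape(2, 3, 5)
dims = (3, 5, 7)
expected = tile_reference(a, dims)
agree = {f.__name__: bool(np.array_equal(f(a, dims), expected))
         for f in (tile_pad, tile_meshgrid, tile_rav_mult_idx)}
print(agree)
```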
Another solution which is even more concise:
arr = np.arange(6).reshape((2, 3))
desired_shape = np.array((5, 8))
pads = tuple((0, i) for i in (desired_shape-arr.shape))
# pads = ((0, add_rows), (0, add_columns), ...)
np.pad(arr, pads, mode="wrap")
but it is slower for small arrays (much faster for large ones though). Strangely, np.pad won't accept np.array for pads.
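One caveat with this approach: np.pad only adds elements, so it assumes every target dimension is at least as large as the corresponding source dimension. A minimal sketch of a wrapper that makes this explicit might look like the following; `tile_pad_checked` is a hypothetical name, and the choice to raise `ValueError` is an assumption rather than anything np.pad prescribes.

```python
import numpy as np

def tile_pad_checked(a, shape):
    # Hypothetical wrapper around the np.pad approach with an explicit check.
    a = np.asarray(a)
    extra = [int(n) - int(m) for n, m in zip(shape, a.shape)]
    if len(shape) != a.ndim or any(e < 0 for e in extra):
        raise ValueError("shape must have a.ndim entries, each >= the "
                         "matching entry of a.shape")
    # Plain Python ints in a tuple of (before, after) pairs, one per axis.
    pads = tuple((0, e) for e in extra)
    return np.pad(a, pads, mode="wrap")

arr = np.arange(6).reshape((2, 3))
padded = tile_pad_checked(arr, (5, 8))
print(padded.shape)
```

A target such as (4, 2), which is narrower than the source along the second axis, would be rejected here and needs one of the index-based methods instead.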
Here's a pretty concise method:
In [57]: a
Out[57]:
array([[0, 1, 2],
       [3, 4, 5]])

In [58]: old = a.shape

In [59]: new = (5, 8)

In [60]: a[(np.arange(new[0]) % old[0])[:,None], np.arange(new[1]) % old[1]]
Out[60]:
array([[0, 1, 2, 0, 1, 2, 0, 1],
       [3, 4, 5, 3, 4, 5, 3, 4],
       [0, 1, 2, 0, 1, 2, 0, 1],
       [3, 4, 5, 3, 4, 5, 3, 4],
       [0, 1, 2, 0, 1, 2, 0, 1]])
Here's an n-dimensional generalization:
def rep_shape(a, shape):
    indices = np.meshgrid(*[np.arange(k) % j for j, k in zip(a.shape, shape)],
                          sparse=True, indexing='ij')
    # Convert to a tuple: recent NumPy rejects a list of index arrays here.
    return a[tuple(indices)]
For example:
In [89]: a
Out[89]:
array([[0, 1, 2],
       [3, 4, 5]])

In [90]: rep_shape(a, (5, 8))
Out[90]:
array([[0, 1, 2, 0, 1, 2, 0, 1],
       [3, 4, 5, 3, 4, 5, 3, 4],
       [0, 1, 2, 0, 1, 2, 0, 1],
       [3, 4, 5, 3, 4, 5, 3, 4],
       [0, 1, 2, 0, 1, 2, 0, 1]])

In [91]: rep_shape(a, (4, 2))
Out[91]:
array([[0, 1],
       [3, 4],
       [0, 1],
       [3, 4]])

In [92]: b = np.arange(24).reshape(2,3,4)

In [93]: b
Out[93]:
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],

       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])

In [94]: rep_shape(b, (3,4,5))
Out[94]:
array([[[ 0,  1,  2,  3,  0],
        [ 4,  5,  6,  7,  4],
        [ 8,  9, 10, 11,  8],
        [ 0,  1,  2,  3,  0]],

       [[12, 13, 14, 15, 12],
        [16, 17, 18, 19, 16],
        [20, 21, 22, 23, 20],
        [12, 13, 14, 15, 12]],

       [[ 0,  1,  2,  3,  0],
        [ 4,  5,  6,  7,  4],
        [ 8,  9, 10, 11,  8],
        [ 0,  1,  2,  3,  0]]])
Here's how the first example works...
The idea is to use arrays to index a. Take a look at np.arange(new[0]) % old[0]:
In [61]: np.arange(new[0]) % old[0]
Out[61]: array([0, 1, 0, 1, 0])
Each value in that array gives the row of a to use in the result. Similarly,
In [62]: np.arange(new[1]) % old[1]
Out[62]: array([0, 1, 2, 0, 1, 2, 0, 1])
gives the columns of a to use in the result. For these index arrays to create a 2-d result, we have to reshape the first one into a column:
In [63]: (np.arange(new[0]) % old[0])[:,None]
Out[63]:
array([[0],
       [1],
       [0],
       [1],
       [0]])
When arrays are used as indices, they broadcast. Here's what the broadcast indices look like:
In [65]: i, j = np.broadcast_arrays((np.arange(new[0]) % old[0])[:,None], np.arange(new[1]) % old[1])

In [66]: i
Out[66]:
array([[0, 0, 0, 0, 0, 0, 0, 0],
       [1, 1, 1, 1, 1, 1, 1, 1],
       [0, 0, 0, 0, 0, 0, 0, 0],
       [1, 1, 1, 1, 1, 1, 1, 1],
       [0, 0, 0, 0, 0, 0, 0, 0]])

In [67]: j
Out[67]:
array([[0, 1, 2, 0, 1, 2, 0, 1],
       [0, 1, 2, 0, 1, 2, 0, 1],
       [0, 1, 2, 0, 1, 2, 0, 1],
       [0, 1, 2, 0, 1, 2, 0, 1],
       [0, 1, 2, 0, 1, 2, 0, 1]])
These are the index arrays that we need to generate the array with shape (5, 8):
In [68]: a[i,j]
Out[68]:
array([[0, 1, 2, 0, 1, 2, 0, 1],
       [3, 4, 5, 3, 4, 5, 3, 4],
       [0, 1, 2, 0, 1, 2, 0, 1],
       [3, 4, 5, 3, 4, 5, 3, 4],
       [0, 1, 2, 0, 1, 2, 0, 1]])
When index arrays are given as in the example at the beginning (i.e. using (np.arange(new[0]) % old[0])[:,None] in the first index slot), numpy doesn't actually generate these index arrays in memory like I did with i and j. i and j show the effective contents when broadcasting occurs.
The function rep_shape does the same thing, using np.meshgrid to generate the index arrays for each "slot" with the correct shapes for broadcasting.
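As a side note, np.ix_ builds the same kind of open mesh from 1-d index arrays, so a variant along the following lines should behave the same way. Treat it as a sketch: `rep_shape_ix` is a hypothetical name, and it assumes `shape` has one entry per axis of `a`.

```python
import numpy as np

def rep_shape_ix(a, shape):
    # Hypothetical variant of rep_shape: np.ix_ reshapes each 1-d index
    # array so that the set broadcasts to the target shape.
    index_1d = [np.arange(k) % j for j, k in zip(a.shape, shape)]
    return a[np.ix_(*index_1d)]

a = np.arange(6).reshape(2, 3)

# The open mesh for a (5, 8) result: a column of row indices and a row of
# column indices, which broadcast against each other when used to index.
open_mesh = np.ix_(np.arange(5) % 2, np.arange(8) % 3)
mesh_shapes = [m.shape for m in open_mesh]
print(mesh_shapes)

result = rep_shape_ix(a, (5, 8))
small = rep_shape_ix(a, (4, 2))
print(result)
print(small)
```

Since np.ix_ already returns a tuple, the result can be passed straight to the indexing operation.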