Flatten a list containing numpy arrays with different shapes

I am trying to flatten the following lists of NumPy arrays into a single 1-D array:

import numpy as np

a = np.arange(9).reshape(3,3)
b = np.arange(25).reshape(5,5)
c = np.arange(4).reshape(2,2)
myarrs = [a,b,c]

d = np.arange(5*5*5).reshape(5,5,5)
myarrs2 = [a,b,c,d]

For myarrs I am currently using:

res = np.hstack([np.hstack(i) for i in myarrs])

I was wondering whether there are other built-in methods for this task, in particular for arrays with different shapes. I have seen other questions, such as "Flattening a list of NumPy arrays?", but they usually deal with arrays that all have the same shape.
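To illustrate why differing shapes matter here, the following sketch applies the nested np.hstack approach to myarrs2, which mixes 2-D and 3-D arrays. The names inner_shape and nested_hstack_ok are introduced only for this illustration. Under current NumPy behaviour, np.hstack on a 3-D array should return a 2-D result, so the outer call can no longer join the pieces:

```python
import numpy as np

a = np.arange(9).reshape(3, 3)
b = np.arange(25).reshape(5, 5)
c = np.arange(4).reshape(2, 2)
d = np.arange(5 * 5 * 5).reshape(5, 5, 5)
myarrs2 = [a, b, c, d]

# np.hstack on a 3-D array joins its 2-D slices side by side,
# so the result is still 2-D rather than flat.
inner_shape = np.hstack(d).shape

try:
    np.hstack([np.hstack(i) for i in myarrs2])
    nested_hstack_ok = True
except ValueError:
    # The 1-D pieces (from a, b, c) and the 2-D piece (from d)
    # cannot be concatenated together.
    nested_hstack_ok = False

print(inner_shape)
print(nested_hstack_ok)
```

So the nested np.hstack expression happens to work for myarrs only because every array in that list is 2-D.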

Asked by G M on Mar 03 '23 00:03


2 Answers

You could try something like:

np.concatenate([x.ravel() for x in myarrs])

This should be faster than your approach, by roughly a factor of ten in the timings below:

a = np.arange(9).reshape(3,3)
b = np.arange(25).reshape(5,5)
c = np.arange(4).reshape(2,2)
myarrs = [a,b,c]


res = np.concatenate([x.ravel() for x in myarrs])
print(res)
# [ 0  1  2  3  4  5  6  7  8  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24  0  1  2  3]


%timeit np.concatenate([x.ravel() for x in myarrs])
# 100000 loops, best of 3: 2.47 µs per loop
%timeit np.concatenate(list(map(lambda x: x.ravel(), myarrs)))
# 100000 loops, best of 3: 2.85 µs per loop
%timeit np.concatenate([x.flatten() for x in myarrs])
# 100000 loops, best of 3: 3.69 µs per loop
%timeit np.hstack([x.ravel() for x in myarrs])
# 100000 loops, best of 3: 5.69 µs per loop
%timeit np.hstack([np.hstack(i) for i in myarrs])
# 10000 loops, best of 3: 29.1 µs per loop
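
As a hedged follow-up to the timings above, this sketch checks two things: that the ravel-plus-concatenate approach also handles myarrs2 from the question (which includes a 3-D array), and that ravel() returns a view for a contiguous array while flatten() always copies, which is a plausible reason for the gap between those two timings. The names res2, shares_ravel and shares_flatten are introduced only for this illustration.

```python
import numpy as np

a = np.arange(9).reshape(3, 3)
b = np.arange(25).reshape(5, 5)
c = np.arange(4).reshape(2, 2)
d = np.arange(5 * 5 * 5).reshape(5, 5, 5)
myarrs2 = [a, b, c, d]

# ravel() flattens an array of any dimensionality to 1-D, so the
# pieces can always be concatenated along axis 0.
res2 = np.concatenate([x.ravel() for x in myarrs2])
print(res2.shape)  # 9 + 25 + 4 + 125 elements in total

# For a C-contiguous array, ravel() returns a view on the same
# buffer; flatten() always allocates a new copy.
shares_ravel = np.shares_memory(a, a.ravel())
shares_flatten = np.shares_memory(a, a.flatten())
print(shares_ravel, shares_flatten)
```

np.concatenate copies the data into the result in either case, so the view returned by ravel() only saves the intermediate copy per array.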

Answered by norok2 on May 03 '23 17:05


I understand that you are looking for a NumPy-only solution. However, if a third-party package is allowed, another possibility is to use more_itertools together with ravel(), reshape(-1), or flatten():

>>> import more_itertools
>>> list(more_itertools.flatten(([x.reshape(-1) for x in myarrs])))
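
One point worth noting about the snippet above is that it produces a Python list of NumPy scalars, not an ndarray. The sketch below uses itertools.chain.from_iterable from the standard library, which to my understanding is what more_itertools.flatten wraps, so it runs without the extra dependency. The names flat_list and flat_arr are introduced only for this illustration, and converting back with np.fromiter is one possible option, not part of the original answer.

```python
from itertools import chain

import numpy as np

a = np.arange(9).reshape(3, 3)
b = np.arange(25).reshape(5, 5)
c = np.arange(4).reshape(2, 2)
myarrs = [a, b, c]

# Chaining the flattened arrays yields their elements one by one;
# list() collects them into a plain Python list.
flat_list = list(chain.from_iterable(x.ravel() for x in myarrs))
print(type(flat_list), len(flat_list))

# If an ndarray is needed, np.fromiter can consume the same iterator
# directly (the dtype has to be given explicitly).
flat_arr = np.fromiter(
    chain.from_iterable(x.ravel() for x in myarrs), dtype=a.dtype
)
print(flat_arr.shape)
```

Iterating element by element in Python is the likely reason this route is slower than np.concatenate in the comparison that follows.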

Comparison:

Original solution

%timeit -n 100000 np.hstack([np.hstack(i) for i in myarrs])
> 31.6 µs ± 1.16 µs per loop (mean ± std. dev. of 7 runs, 100000 loops each)

Solution from norok2 (fastest)

%timeit -n 100000 np.concatenate([x.ravel() for x in myarrs])
> 2.62 µs ± 54.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

Solution with more_itertools + reshape(-1)

%timeit -n 100000 list(more_itertools.flatten(([x.reshape(-1) for x in myarrs])))
> 9.32 µs ± 255 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

Solution with more_itertools + ravel()

%timeit -n 100000 list(more_itertools.flatten(([x.ravel() for x in myarrs])))
> 7.33 µs ± 235 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

Solution with more_itertools + flatten()

%timeit list(more_itertools.flatten(([x.flatten() for x in myarrs])))
> 8.3 µs ± 65.5 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

Answered by Grayrigel on May 03 '23 16:05