Say I have an array a:
a = np.array([[1,2,3], [4,5,6]])
array([[1, 2, 3],
       [4, 5, 6]])
I would like to convert it to a 1D array (i.e. a column vector):
b = np.reshape(a, (1,np.product(a.shape)))
but this returns
array([[1, 2, 3, 4, 5, 6]])
which is not the same as:
array([1, 2, 3, 4, 5, 6])
I can take the first element of this array to manually convert it to a 1D array:
b = np.reshape(a, (1,np.product(a.shape)))[0]
but this requires me to know how many dimensions the original array has (and to chain additional [0]'s when working with higher dimensions).
Is there a dimensions-independent way of getting a column/row vector from an arbitrary ndarray?
An ndarray is a (usually fixed-size) multidimensional container of items of the same type and size. The number of dimensions and items in an array is defined by its shape, which is a tuple of N non-negative integers that specify the size of each dimension.
Use np.ravel (for a 1D view), np.ndarray.flatten (for a 1D copy), or np.ndarray.flat (for a 1D iterator):
In [12]: a = np.array([[1,2,3], [4,5,6]])
In [13]: b = a.ravel()
In [14]: b
Out[14]: array([1, 2, 3, 4, 5, 6])
Note that ravel() returns a view of a when possible. So modifying b also modifies a. ravel() returns a view when the 1D elements are contiguous in memory, but would return a copy if, for example, a were made from slicing another array using a non-unit step size (e.g. a = x[::2]).
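The view-versus-copy behaviour described above can be checked with np.shares_memory. This is only a minimal sketch: the array x and the other variable names are made up for illustration, and the results are what I would expect on current NumPy releases.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])
b = a.ravel()                    # a is C-contiguous, so ravel returns a view
b[0] = 100                       # writing through the view also changes a
print(a[0, 0])                   # 100
print(np.shares_memory(a, b))    # True

x = np.arange(10)                # hypothetical source array
s = x[::2]                       # non-unit step: elements are not contiguous
r = s.ravel()                    # ravel has to copy in this case
print(np.shares_memory(s, r))    # False

c = a.flatten()                  # flatten always returns a copy
print(np.shares_memory(a, c))    # False
```

If the original array must stay untouched, checking np.shares_memory (or calling flatten) before writing to the result is a cheap safeguard.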
If you want a copy rather than a view, use
In [15]: c = a.flatten()
If you just want an iterator, use np.ndarray.flat:
In [20]: d = a.flat
In [21]: d
Out[21]: <numpy.flatiter object at 0x8ec2068>
In [22]: list(d)
Out[22]: [1, 2, 3, 4, 5, 6]
In [14]: b = np.reshape(a, (np.prod(a.shape),))
In [15]: b
Out[15]: array([1, 2, 3, 4, 5, 6])
or, simply:
In [16]: a.flatten()
Out[16]: array([1, 2, 3, 4, 5, 6])
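Since the question asks for something that does not depend on the number of dimensions, here is a small sketch using -1 in reshape, which lets NumPy infer the length. The 3-D array a3 is a hypothetical input; the same calls should work for any ndarray.

```python
import numpy as np

a3 = np.arange(24).reshape(2, 3, 4)   # hypothetical 3-D input

flat = a3.reshape(-1)      # shape (24,): a true 1-D array
row = a3.reshape(1, -1)    # shape (1, 24): a 2-D row vector
col = a3.reshape(-1, 1)    # shape (24, 1): a 2-D column vector

print(flat.shape, row.shape, col.shape)   # (24,) (1, 24) (24, 1)
```

Note that only the first result is 1-D; the row and column vectors are still 2-D arrays, which matters if later code checks ndim.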
I wanted to see benchmark results for the functions mentioned in the answers, including unutbu's.
I also want to point out that the NumPy documentation recommends arr.reshape(-1) when a view is preferable (even though ravel is a tad faster in the results below).
TL;DR:
np.ravel is the most performant (by a very small amount).
Functions:
np.ravel: returns a view, if possible
np.reshape(-1): returns a view, if possible
ndarray.flatten: returns a copy
ndarray.flat: returns a numpy.flatiter, which behaves like an iterable
numpy version: '1.18.0'
ndarray sizes
+-------------+----------+-----------+-----------+-------------+
| function | 10x10 | 100x100 | 1000x1000 | 10000x10000 |
+-------------+----------+-----------+-----------+-------------+
| ravel | 0.002073 | 0.002123 | 0.002153 | 0.002077 |
| reshape(-1) | 0.002612 | 0.002635 | 0.002674 | 0.002701 |
| flatten | 0.000810 | 0.007467 | 0.587538 | 107.321913 |
| flat | 0.000337 | 0.000255 | 0.000227 | 0.000216 |
+-------------+----------+-----------+-----------+-------------+
The execution times of ravel and reshape(-1) were consistent and independent of ndarray size. ravel is a tad faster, but reshape provides flexibility in choosing the new shape. (Maybe that's why the NumPy documentation recommends it instead, or there could be some cases where reshape returns a view and ravel doesn't.)
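One case where reshape can return a view while ravel copies, as far as I can tell, is a 1-D array with a non-unit stride. The sketch below uses illustrative names (x, s) and np.shares_memory to check; treat it as an observation about current NumPy behaviour rather than a documented guarantee.

```python
import numpy as np

x = np.arange(10)
s = x[::2]                    # 1-D, strided, not contiguous in memory

via_ravel = s.ravel()         # ravel returns a contiguous array, so it copies
via_reshape = s.reshape(-1)   # reshape can keep the stride and return a view

print(np.shares_memory(s, via_ravel))     # False
print(np.shares_memory(s, via_reshape))   # True
```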
If you are dealing with large ndarrays, using flatten can cause a performance issue. I recommend not using it unless you need a copy of the data to do something else.
import timeit

# The setup code runs once per timeit call; change `size` to test other shapes
setup = '''
import numpy as np
nd = np.random.randint(10, size=(10, 10))
'''

# Each call returns the total time in seconds for `number` executions
timeit.timeit('nd = np.reshape(nd, -1)', setup=setup, number=1000)
timeit.timeit('nd = np.ravel(nd)', setup=setup, number=1000)
timeit.timeit('nd = nd.flatten()', setup=setup, number=1000)
timeit.timeit('nd.flat', setup=setup, number=1000)
import numpy as np
# list of sequences with different lengths
a = [[1],[2,3,4,5],[6,7,8]]
# stack them
b = np.hstack(a)
print(b)
[1 2 3 4 5 6 7 8]