Pythonic way to create a numpy array from a list of numpy arrays

I generate a list of one-dimensional numpy arrays in a loop and later convert this list to a 2D numpy array. I would have preallocated a 2D numpy array if I knew the number of items ahead of time, but I don't, so I put everything in a list.

A mock-up is below:

>>> from numpy import array, ones
>>> list_of_arrays = list(map(lambda x: x*ones(2), range(5)))
>>> list_of_arrays
[array([ 0.,  0.]), array([ 1.,  1.]), array([ 2.,  2.]), array([ 3.,  3.]), array([ 4.,  4.])]
>>> arr = array(list_of_arrays)
>>> arr
array([[ 0.,  0.],
       [ 1.,  1.],
       [ 2.,  2.],
       [ 3.,  3.],
       [ 4.,  4.]])

My question is the following:

Is there a better way (performance-wise) to collect sequential numerical data (in my case numpy arrays) than putting them in a list and then making a numpy.array out of it (which creates a new object and copies the data)? Is there an "expandable" matrix data structure available in a well-tested module?

A typical size of my 2D matrix would be between 100x10 and 5000x10 floats.

EDIT: In this example I'm using map, but in my actual application I have a for loop.
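
For illustration, the for-loop version of the same pattern might look like the sketch below. `make_row` is a hypothetical stand-in for whatever the real loop computes; it is not part of the original code.

```python
import numpy as np


def make_row(i):
    # Hypothetical stand-in for the real per-iteration computation.
    return i * np.ones(2)


rows = []
for i in range(5):
    # Appending to a Python list does not copy the arrays already collected.
    rows.append(make_row(i))

# A single conversion at the end copies the rows into one 2D array.
arr = np.array(rows)
print(arr.shape)
```

The number of rows does not need to be known in advance; only the final conversion allocates the 2D array.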

826

asked Jan 21 '10 01:01

AnalyticsBuilder

1 Answer

A convenient way is to use numpy.concatenate. I believe it is also faster than @unutbu's answer:

In [32]: import numpy as np

In [33]: list_of_arrays = list(map(lambda x: x * np.ones(2), range(5)))

In [34]: list_of_arrays
Out[34]:
[array([ 0.,  0.]),
 array([ 1.,  1.]),
 array([ 2.,  2.]),
 array([ 3.,  3.]),
 array([ 4.,  4.])]

In [37]: shape = list(list_of_arrays[0].shape)

In [38]: shape
Out[38]: [2]

In [39]: shape[:0] = [len(list_of_arrays)]

In [40]: shape
Out[40]: [5, 2]

In [41]: arr = np.concatenate(list_of_arrays).reshape(shape)

In [42]: arr
Out[42]:
array([[ 0.,  0.],
       [ 1.,  1.],
       [ 2.,  2.],
       [ 3.,  3.],
       [ 4.,  4.]])
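
As a rough sketch, the concatenate-and-reshape steps can be written compactly and checked against a few other NumPy calls that build the same 2D array. The variable names `via_concatenate`, `via_array`, `via_stack` and `via_vstack` are only illustrative, and `np.stack` assumes NumPy 1.10 or newer. No timings are claimed here.

```python
import numpy as np

list_of_arrays = [x * np.ones(2) for x in range(5)]

# The approach shown above: join everything into one flat array, then reshape.
shape = (len(list_of_arrays),) + list_of_arrays[0].shape
via_concatenate = np.concatenate(list_of_arrays).reshape(shape)

# Alternatives that produce the same result in a single call.
via_array = np.array(list_of_arrays)
via_stack = np.stack(list_of_arrays)    # joins along a new leading axis
via_vstack = np.vstack(list_of_arrays)  # treats each 1D array as a row

# All four results hold the same values in the same (5, 2) layout.
assert np.array_equal(via_concatenate, via_array)
assert np.array_equal(via_concatenate, via_stack)
assert np.array_equal(via_concatenate, via_vstack)
print(via_concatenate.shape)
```

Which of these is fastest may depend on the NumPy version and the array sizes, so it is worth timing them (for example with `timeit`) on data of the actual size, such as 5000x10.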

194

answered Sep 20 '22 23:09

Gill Bates

Tags: performance, python, arrays, numpy, scipy
