In my current work I use NumPy and list comprehensions a lot, and in the interest of the best possible performance I have the following questions:
What actually happens behind the scenes if I create a Numpy array as follows?
a = numpy.array([1, 2, 3, 4])
My guess is that Python first creates an ordinary list containing the values, then uses the list size to allocate a NumPy array and afterwards copies the values into this new array. Is this correct, or is the interpreter clever enough to realize that the list is only intermediary and instead copy the values directly?
Similarly, if I wish to create a NumPy array from a list comprehension using numpy.fromiter():

a = numpy.fromiter([x for x in xrange(0, 4)], int)

will this result in an intermediary list of values being created before being fed into fromiter()?
When working with NumPy, keep in mind that list comprehensions remain an option.
Even for the delete operation, the NumPy array is faster. As the array size increases, NumPy can become around 30 times faster than a Python list. Because a NumPy array is densely packed in memory due to its homogeneous type, it also frees memory faster.
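If you want to check that deletion claim on your own machine, here is a minimal sketch; the container size, the index, and the repetition count are illustrative choices of mine, not from the original answer:

import timeit

setup = """
import numpy as np
n = 100000
lst = list(range(n))
arr = np.arange(n)
"""

# Copy the list first so every repetition deletes from identical data;
# np.delete likewise returns a new array rather than mutating in place.
list_time = timeit.timeit("tmp = lst[:]; del tmp[n // 2]", setup=setup, number=100)
array_time = timeit.timeit("tmp = np.delete(arr, n // 2)", setup=setup, number=100)

print("list delete: ", list_time)
print("numpy delete:", array_time)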
The answer is performance. NumPy data structures perform better in two ways: size (they take up less space) and speed (they are faster than lists).
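To see the size difference concretely, a small sketch (exact byte counts vary with platform and Python version):

import sys
import numpy as np

n = 1000
lst = list(range(n))
arr = np.arange(n)

# A list stores pointers to boxed int objects; the array stores raw
# machine integers in one contiguous buffer.
list_bytes = sys.getsizeof(lst) + sum(sys.getsizeof(x) for x in lst)

print("list: ", list_bytes, "bytes")
print("array:", sys.getsizeof(arr), "bytes total,", arr.nbytes, "bytes of data")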
List comprehensions on tiny lists are faster than doing the same with NumPy, because the performance gain from using NumPy is not enough to offset the overhead of creating an array.
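A quick illustrative timing of that crossover (the input size and repetition count are my own arbitrary choices):

from timeit import timeit

# On a tiny input, constructing the ndarray costs more than the
# vectorized multiply saves.
print(timeit("[x * 2 for x in range(5)]", number=100000))
print(timeit("np.arange(5) * 2", "import numpy as np", number=100000))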
I believe the answer you are looking for is to use generator expressions with numpy.fromiter.
numpy.fromiter((<some_func>(x) for x in <something>), <dtype>, <size of something>)
Generator expressions are lazy: they evaluate the expression only as you iterate through them. Using a list comprehension builds the whole list first and then feeds it into NumPy, while a generator expression yields one value at a time.
Python evaluates expressions from the inside out, like most languages (if not all), so [<something> for <something_else> in <something_different>] builds the list first, which is then iterated over.
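For example, filling the template in with a concrete function (squaring here is just a stand-in for <some_func>):

import numpy as np

# The generator expression yields one square at a time; passing count
# lets fromiter preallocate the output instead of growing it.
a = np.fromiter((x * x for x in range(4)), dtype=int, count=4)
print(a)  # [0 1 4 9]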
You could create your own list and experiment with it to shed some light on the situation...
>>> class my_list(list):
...     def __init__(self, arg):
...         print('spam')
...         super(my_list, self).__init__(arg)
...     def __len__(self):
...         print('eggs')
...         return super(my_list, self).__len__()
...
>>> x = my_list([0,1,2,3])
spam
>>> len(x)
eggs
4
>>> import numpy as np
>>> np.array(x)  # np.array probes the input's length while building the array
eggs
eggs
eggs
eggs
array([0, 1, 2, 3])
>>> np.fromiter(x, int)  # fromiter just iterates; __len__ is never called
array([0, 1, 2, 3])
>>> np.array(my_list([0,1,2,3]))  # the intermediate list really is built first
spam
eggs
eggs
eggs
eggs
array([0, 1, 2, 3])
To the question in the title: there is now a package called numba which supports NumPy array comprehensions, constructing the NumPy array directly without an intermediate Python list. Unlike numpy.fromiter, it also supports nested comprehensions. However, bear in mind that numba has some restrictions and performance quirks if you are not familiar with it.
That said, it can be quite fast and efficient, but if you can express the computation with NumPy's vectorized operations, it may be better to keep things simple.
>>> from timeit import timeit
>>> # using list comprehension
>>> timeit("np.array([i*i for i in range(1000)])", "import numpy as np", number=1000)
2.544344299999999
>>> # using numpy operations
>>> timeit("np.arange(1000) ** 2", "import numpy as np", number=1000)
0.05207519999999022
>>> # using numpy.fromiter
>>> timeit("np.fromiter((i*i for i in range(1000)), dtype=int, count=1000)",
... "import numpy as np",
... number=1000)
1.087984500000175
>>> # using numba array comprehension
>>> timeit("squares(1000)",
... """
... import numpy as np
... import numba as nb
...
... @nb.njit
... def squares(n):
...     return np.array([i*i for i in range(n)])
...
... 'compile the function'
... squares(10)
... """,
... number=1000)
0.03716940000003888
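Note that the setup string compiles squares once (the squares(10) call) before the timing starts, so the numba figure measures only the compiled calls, not the JIT compilation itself.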