
efficient python array to numpy array conversion

Tags: python, numpy

I get a big array (a 12 Mpix image) as an array.array from the Python standard library. Since I want to perform operations on this array, I would like to convert it to a NumPy array. I tried the following:

    import numpy
    import array
    from datetime import datetime

    test = array.array('d', [0]*12000000)
    t = datetime.now()
    numpy.array(test)
    print(datetime.now() - t)

I get a result between one and two seconds, which is equivalent to a loop in Python.

Is there a more efficient way of doing this conversion?

Simon Bergot asked Apr 15 '11 09:04




2 Answers

    np.array(test)                  # 1.19s
    np.fromiter(test, dtype=int)    # 1.08s
    np.frombuffer(test)             # 459ns !!!
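The gap in the timings is consistent with np.frombuffer wrapping the existing memory rather than copying it. A minimal sketch of that behaviour, assuming Python 3; the names src, view, ints and as_ints are illustrative and not from the original answer. Note that frombuffer defaults to a float dtype, which happens to match typecode 'd'; other typecodes need an explicit dtype.

```python
import array
import numpy as np

# Small stand-in for the 12M-element array in the question.
src = array.array('d', [0.0] * 5)

# No dtype given: frombuffer defaults to float (float64), matching typecode 'd'.
view = np.frombuffer(src)

# Writing to the source is visible through the NumPy array,
# which indicates that no copy was made.
src[0] = 1.5
print(view[0])

# For other typecodes the dtype must be passed explicitly.
# Typecode 'i' is a C int, which corresponds to np.intc.
ints = array.array('i', [1, 2, 3])
as_ints = np.frombuffer(ints, dtype=np.intc)
print(as_ints.tolist())
```

Because the two objects share memory, it is safest not to resize the source array while the NumPy array is in use.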
eumiro answered Sep 27 '22 22:09


asarray(x) is almost always the best choice for any array-like object.

array and fromiter are slow because they perform a copy. Using asarray allows this copy to be elided:

    >>> import array
    >>> import numpy as np
    >>> test = array.array('d', [0]*12000000)

    # very slow - this makes multiple copies that grow each time
    >>> %timeit np.fromiter(test, dtype=test.typecode)
    626 ms ± 3.97 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

    # fast memory copy
    >>> %timeit np.array(test)
    63.5 ms ± 639 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

    # which is equivalent to doing the fast construction followed by a copy
    >>> %timeit np.asarray(test).copy()
    63.4 ms ± 371 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

    # so doing just the construction is way faster
    >>> %timeit np.asarray(test)
    1.73 µs ± 70.2 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

    # marginally faster, but at the expense of verbosity and type safety if you
    # get the wrong type
    >>> %timeit np.frombuffer(test, dtype=test.typecode)
    1.07 µs ± 27.3 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
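The difference between the copying and non-copying constructors can also be checked without timing anything. A small sketch, assuming Python 3 and a recent NumPy; src, view, copy and ints are hypothetical names used only for illustration:

```python
import array
import numpy as np

src = array.array('d', [0.0, 1.0, 2.0])

view = np.asarray(src)  # wraps the buffer exposed by array.array
copy = np.array(src)    # allocates new memory and copies the data

# A write through the asarray result reaches the original array...
view[0] = 10.0
# ...while a write to the np.array result does not.
copy[1] = -1.0

print(src.tolist())

# asarray reads the element type from the buffer itself,
# so no dtype argument is needed for other typecodes.
ints = np.asarray(array.array('i', [1, 2, 3]))
print(ints.dtype)
```

The trade-off is aliasing: since asarray shares memory with the source, later changes to either object show up in the other, so np.array remains the right call when an independent copy is wanted.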
Eric answered Sep 27 '22 20:09