I found that a bottleneck in my program is the creation of numpy arrays from a list of given values, most commonly putting four values into a 2x2 array. There is an obvious, easy-to-read way to do it:
my_array = numpy.array([[1, 3], [2.4, -1]])
which takes 15 us -- very very slow since I'm doing it millions of times.
Then there is a far faster, hard-to-read way:
my_array = numpy.empty((2,2))
my_array[0,0] = 1
my_array[0,1] = 3
my_array[1,0] = 2.4
my_array[1,1] = -1
This is 10 times faster, at just 1 us.
Is there any method that is BOTH fast AND easy-to-read?
What I tried so far: Using asarray instead of array makes no difference; passing dtype=float into array also makes no difference. Finally, I understand that I can do it myself:
def make_array_from_list(the_list, num_rows, num_cols):
    the_array = np.empty((num_rows, num_cols))
    for i in range(num_rows):
        for j in range(num_cols):
            the_array[i,j] = the_list[i][j]
    return the_array
This will create the array in 4us, which is medium readability at medium speed (compared to the two approaches above). But really, I cannot believe that there is not a better approach using built-in methods.
Thank you in advance!!
This is a great question. I can't find anything that approaches the speed of your completely unrolled solution (edit: @BiRico was able to come up with something close; see the comments and the update :). Here are a bunch of different options that I (and others) came up with, along with their timings:
import numpy as np

def f1():
    "np.array + nested lists"
    my_array = np.array([[1, 3], [2.4, -1]])

def f2():
    "np.array + nested tuples"
    my_array = np.array(((1, 3), (2.4, -1)))

def f3():
    "Completely unrolled"
    my_array = np.empty((2,2),dtype=float)
    my_array[0,0] = 1
    my_array[0,1] = 3
    my_array[1,0] = 2.4
    my_array[1,1] = -1

def f4():
    "empty + ravel + list"
    my_array = np.empty((2,2),dtype=float)
    my_array.ravel()[:] = [1,3,2.4,-1]

def f5():
    "empty + ravel + tuple"
    my_array = np.empty((2,2),dtype=float)
    my_array.ravel()[:] = (1,3,2.4,-1)

def f6():
    "empty + slice assignment"
    my_array = np.empty((2,2),dtype=float)
    my_array[0,:] = (1,3)
    my_array[1,:] = (2.4,-1)

def f7():
    "empty + index assignment"
    my_array = np.empty((2,2),dtype=float)
    my_array[0] = (1,3)
    my_array[1] = (2.4,-1)

def f8():
    "np.array + flat list + reshape"
    my_array = np.array([1, 3, 2.4, -1]).reshape((2,2))

def f9():
    "np.empty + ndarray.flat (Pierre GM)"
    my_array = np.empty((2,2), dtype=float)
    my_array.flat = (1,3,2.4,-1)

def f10():
    "np.fromiter (Bi Roco)"
    my_array = np.fromiter((1,3,2.4,-1), dtype=float).reshape((2,2))

import timeit

results = {}
for i in range(1,11):
    func_name = 'f%d'%i
    my_import = 'from __main__ import %s'%func_name
    func_doc = globals()[func_name].__doc__
    # Store (total seconds for 100000 calls, "name<TAB>docstring") per function.
    results[func_name] = (timeit.timeit(func_name+'()',
                                        my_import,
                                        number=100000),
                          '\t'.join((func_name,func_doc)))

# Tuples sort by their first element, so the fastest function prints first.
# print(...) with a single argument works on both Python 2 and Python 3.
for result in sorted(results.values()):
    print('\t'.join(map(str,result)))
And the important timings (total seconds for 100000 calls of each function, fastest first):
On Ubuntu Linux, Core i7:
0.158674955368 f3 Completely unrolled
0.225094795227 f10 np.fromiter (Bi Roco)
0.737828969955 f8 np.array + flat list + reshape
0.782918930054 f5 empty + ravel + tuple
0.786983013153 f9 np.empty + ndarray.flat (Pierre GM)
0.814703941345 f4 empty + ravel + list
1.2375421524 f7 empty + index assignment
1.32230591774 f2 np.array + nested tuples
1.3752617836 f6 empty + slice assignment
1.39459013939 f1 np.array + nested lists
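For what it's worth, here is a minimal sketch of how the np.fromiter variant (f10, the second fastest above) could be wrapped so the call site stays readable. The helper name array_2x2 is made up for illustration, and the extra function call adds some overhead that is not measured here, so time it on your own machine before relying on it. Passing count is optional; it just tells fromiter the length up front.

```python
import numpy as np

def array_2x2(a, b, c, d):
    """Hypothetical helper: build [[a, b], [c, d]] as a float array."""
    # count=4 lets fromiter allocate the output once instead of growing it.
    return np.fromiter((a, b, c, d), dtype=float, count=4).reshape((2, 2))

my_array = array_2x2(1, 3, 2.4, -1)

# Sanity check: same contents as the easy-to-read np.array version.
assert np.array_equal(my_array, np.array([[1, 3], [2.4, -1]]))
```

The values are passed in row-major order, so the call reads left to right, top to bottom, much like the nested-list form.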