I found that a bottleneck in my program is the creation of numpy arrays from a list of given values, most commonly putting four values into a 2x2 array. There is an obvious, easy-to-read way to do it:
my_array = numpy.array([[1, 3], [2.4, -1]])
which takes 15 us -- very very slow since I'm doing it millions of times.
Then there is a far faster, hard-to-read way:
my_array = numpy.empty((2,2))
my_array[0,0] = 1
my_array[0,1] = 3
my_array[1,0] = 2.4
my_array[1,1] = -1
This is 10 times faster, at just 1 us.
Is there any method that is BOTH fast AND easy-to-read?
What I tried so far: Using asarray instead of array makes no difference; passing dtype=float into array also makes no difference. Finally, I understand that I can do it myself:
def make_array_from_list(the_list, num_rows, num_cols):
    the_array = numpy.empty((num_rows, num_cols))
    for i in range(num_rows):
        for j in range(num_cols):
            the_array[i,j] = the_list[i][j]
    return the_array
This will create the array in 4us, which is medium readability at medium speed (compared to the two approaches above). But really, I cannot believe that there is not a better approach using built-in methods.
Thank you in advance!!
This is a great question. I can't find anything that approaches the speed of your completely unrolled solution. (Edit: @BiRico was able to come up with something close; see the comments and the update.) Here are a bunch of different options that I (and others) came up with, along with their timings:
import numpy as np
def f1():
    "np.array + nested lists"
    my_array = np.array([[1, 3], [2.4, -1]])
def f2():
    "np.array + nested tuples"
    my_array = np.array(((1, 3), (2.4, -1)))
def f3():
    "Completely unrolled"
    my_array = np.empty((2,2),dtype=float)
    my_array[0,0] = 1
    my_array[0,1] = 3
    my_array[1,0] = 2.4
    my_array[1,1] = -1
def f4():
    "empty + ravel + list"
    my_array = np.empty((2,2),dtype=float)
    my_array.ravel()[:] = [1,3,2.4,-1]
def f5():
    "empty + ravel + tuple"
    my_array = np.empty((2,2),dtype=float)
    my_array.ravel()[:] = (1,3,2.4,-1)
def f6():
    "empty + slice assignment"
    my_array = np.empty((2,2),dtype=float)
    my_array[0,:] = (1,3)
    my_array[1,:] = (2.4,-1)
def f7():
    "empty + index assignment"
    my_array = np.empty((2,2),dtype=float)
    my_array[0] = (1,3)
    my_array[1] = (2.4,-1)
def f8():
    "np.array + flat list + reshape"
    my_array = np.array([1, 3, 2.4, -1]).reshape((2,2))
def f9():
    "np.empty + ndarray.flat  (Pierre GM)"
    my_array = np.empty((2,2), dtype=float)
    my_array.flat = (1,3,2.4,-1)
def f10():
    "np.fromiter (Bi Roco)"
    my_array = np.fromiter((1,3,2.4,-1), dtype=float).reshape((2,2))
import timeit
results = {}
for i in range(1,11):
    func_name = 'f%d'%i
    my_import = 'from __main__ import %s'%func_name
    func_doc = globals()[func_name].__doc__
    results[func_name] = (timeit.timeit(func_name+'()',
                                        my_import,
                                        number=100000),
                          '\t'.join((func_name,func_doc)))
for result in sorted(results.values()):
    print('\t'.join(map(str, result)))
And the important timings (total seconds for 100,000 calls of each function, fastest first):
On Ubuntu Linux, Core i7:
0.158674955368  f3  Completely unrolled
0.225094795227  f10 np.fromiter (Bi Roco)
0.737828969955  f8  np.array + flat list + reshape
0.782918930054  f5  empty + ravel + tuple
0.786983013153  f9  np.empty + ndarray.flat  (Pierre GM)
0.814703941345  f4  empty + ravel + list
1.2375421524    f7  empty + index assignment
1.32230591774   f2  np.array + nested tuples
1.3752617836    f6  empty + slice assignment
1.39459013939   f1  np.array + nested lists
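If readability is the main concern, one option is to hide the np.fromiter call from f10 behind a small helper. This is only a sketch: make_2x2 is a hypothetical name, not a NumPy function, and the extra Python function call adds some overhead, so it should be timed in the real program rather than assumed to match f10.

```python
import numpy as np

def make_2x2(a, b, c, d):
    """Hypothetical helper: build [[a, b], [c, d]] as a float array."""
    # np.fromiter fills a 1-D float array directly from the iterable;
    # reshape then presents those four values as a 2x2 array.
    return np.fromiter((a, b, c, d), dtype=float, count=4).reshape((2, 2))

# The call site reads like the matrix it builds.
my_array = make_2x2(1, 3,
                    2.4, -1)
```

The call site stays a single expression laid out in row order, while the fast construction path lives in one place.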