Building a small numpy array from individual values: Fast and readable method?

Tags: python, numpy

I found that a bottleneck in my program is the creation of numpy arrays from a list of given values, most commonly putting four values into a 2x2 array. There is an obvious, easy-to-read way to do it:

my_array = numpy.array([[1, 3], [2.4, -1]])

which takes 15 µs: very slow, since I'm doing it millions of times.

Then there is a far faster, hard-to-read way:

my_array = numpy.empty((2,2))
my_array[0,0] = 1
my_array[0,1] = 3
my_array[1,0] = 2.4
my_array[1,1] = -1

This is roughly 10 times faster, at just 1 µs.
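
For reference, here is a minimal sketch of how the two versions can be timed with the standard timeit module. The function names and the repeat count are illustrative choices rather than code from my actual program, and the absolute numbers will vary with the machine and NumPy version.

```python
import timeit
import numpy

def readable():
    # Nested-list construction: concise, but NumPy has to inspect the input
    return numpy.array([[1, 3], [2.4, -1]])

def unrolled():
    # Allocate uninitialized storage, then fill each element explicitly
    my_array = numpy.empty((2, 2))
    my_array[0, 0] = 1
    my_array[0, 1] = 3
    my_array[1, 0] = 2.4
    my_array[1, 1] = -1
    return my_array

number = 100000  # illustrative repeat count
for func in (readable, unrolled):
    total = timeit.timeit(func, number=number)
    # Convert total seconds into microseconds per call
    print("%s: %.2f us per call" % (func.__name__, total / number * 1e6))
```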

Is there any method that is BOTH fast AND easy-to-read?

What I tried so far: Using asarray instead of array makes no difference; passing dtype=float into array also makes no difference. Finally, I understand that I can do it myself:

def make_array_from_list(the_list, num_rows, num_cols):
    the_array = numpy.empty((num_rows, num_cols))
    for i in range(num_rows):
        for j in range(num_cols):
            the_array[i,j] = the_list[i][j]
    return the_array

This creates the array in 4 µs, which gives medium readability at medium speed (compared to the two approaches above). But really, I cannot believe that there is no better approach using built-in methods.

Thank you in advance!!


asked Oct 29 '12 23:10

Steve Byrnes

1 Answer

This is a great question. I can't find anything that approaches the speed of your completely unrolled solution. (Edit: @BiRico was able to come up with something close; see comments and update.) Here are a bunch of different options that I (and others) came up with, along with their timings:

import numpy as np

def f1():
    "np.array + nested lists"
    my_array = np.array([[1, 3], [2.4, -1]])

def f2():
    "np.array + nested tuples"
    my_array = np.array(((1, 3), (2.4, -1)))

def f3():
    "Completely unrolled"
    my_array = np.empty((2,2),dtype=float)
    my_array[0,0] = 1
    my_array[0,1] = 3
    my_array[1,0] = 2.4
    my_array[1,1] = -1

def f4():
    "empty + ravel + list"
    my_array = np.empty((2,2),dtype=float)
    my_array.ravel()[:] = [1,3,2.4,-1]

def f5():
    "empty + ravel + tuple"
    my_array = np.empty((2,2),dtype=float)
    my_array.ravel()[:] = (1,3,2.4,-1)

def f6():
    "empty + slice assignment"
    my_array = np.empty((2,2),dtype=float)
    my_array[0,:] = (1,3)
    my_array[1,:] = (2.4,-1)

def f7():
    "empty + index assignment"
    my_array = np.empty((2,2),dtype=float)
    my_array[0] = (1,3)
    my_array[1] = (2.4,-1)

def f8():
    "np.array + flat list + reshape"
    my_array = np.array([1, 3, 2.4, -1]).reshape((2,2))

def f9():
    "np.empty + ndarray.flat  (Pierre GM)"
    my_array = np.empty((2,2), dtype=float)
    my_array.flat = (1,3,2.4,-1)

def f10():
    "np.fromiter (Bi Rico)"
    my_array = np.fromiter((1,3,2.4,-1), dtype=float).reshape((2,2))

import timeit
results = {}
for i in range(1,11):
    func_name = 'f%d'%i
    my_import = 'from __main__ import %s'%func_name
    func_doc = globals()[func_name].__doc__
    results[func_name] = (timeit.timeit(func_name+'()',
                                        my_import,
                                        number=100000),
                          '\t'.join((func_name,func_doc)))

for result in sorted(results.values()):
    print('\t'.join(map(str, result)))

And the important timings:

On Ubuntu Linux, Core i7 (total seconds for 100,000 calls of each function):

0.158674955368  f3  Completely unrolled
0.225094795227  f10 np.fromiter (Bi Rico)
0.737828969955  f8  np.array + flat list + reshape
0.782918930054  f5  empty + ravel + tuple
0.786983013153  f9  np.empty + ndarray.flat  (Pierre GM)
0.814703941345  f4  empty + ravel + list
1.2375421524    f7  empty + index assignment
1.32230591774   f2  np.array + nested tuples
1.3752617836    f6  empty + slice assignment
1.39459013939   f1  np.array + nested lists
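
Given these timings, one way to get both speed and readability is to hide the unrolled assignments behind a small helper, so that the call site stays a one-liner. This is only a sketch: the name `array_2x2` and its argument order are made up for illustration, it handles only the 2x2 float case, and the extra function call adds some overhead that is not measured here.

```python
import numpy as np

def array_2x2(a, b, c, d):
    """Return the float array [[a, b], [c, d]] (hypothetical helper)."""
    out = np.empty((2, 2), dtype=float)
    # Same element-by-element fill as f3, just kept out of sight
    out[0, 0] = a
    out[0, 1] = b
    out[1, 0] = c
    out[1, 1] = d
    return out

# Laying the arguments out in two rows mirrors the shape of the result
my_array = array_2x2(1, 3,
                     2.4, -1)
```

Whether this is worthwhile depends on how much of the unrolled version's advantage survives the call overhead, so it is worth timing on your own machine before committing to it.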


answered Sep 16 '22 22:09

mgilson
