
Poor numpy.cross() performance

I've been running performance tests to speed up a pet project I'm writing. It's a very number-crunching-intensive application, so I've been experimenting with NumPy as a way of improving computational performance.

However, the results of the following performance tests were quite surprising.

Test Source Code (Updated with test cases for hoisting and batch submission)

import timeit

numpySetup = """
import numpy
left = numpy.array([1.0,0.0,0.0])
right = numpy.array([0.0,1.0,0.0])
"""

hoistSetup = numpySetup +'hoist = numpy.cross\n'

pythonSetup = """
left = [1.0,0.0,0.0]
right = [0.0,1.0,0.0]
"""

numpyBatchSetup = """
import numpy

l = numpy.array([1.0,0.0,0.0])
left = numpy.array([l]*10000)

r = numpy.array([0.0,1.0,0.0])
right = numpy.array([r]*10000)
"""

pythonCrossCode = """
x = ((left[1] * right[2]) - (left[2] * right[1]))
y = ((left[2] * right[0]) - (left[0] * right[2]))
z = ((left[0] * right[1]) - (left[1] * right[0]))
"""

pythonCross = timeit.Timer(pythonCrossCode, pythonSetup)
numpyCross = timeit.Timer('numpy.cross(left, right)', numpySetup)
hybridCross = timeit.Timer(pythonCrossCode, numpySetup)
hoistCross = timeit.Timer('hoist(left, right)', hoistSetup)
batchCross = timeit.Timer('numpy.cross(left, right)', numpyBatchSetup)

print('Python Cross Product : %4.6f ' % pythonCross.timeit(1000000))
print('Numpy Cross Product  : %4.6f ' % numpyCross.timeit(1000000))
print('Hybrid Cross Product : %4.6f ' % hybridCross.timeit(1000000))
print('Hoist Cross Product  : %4.6f ' % hoistCross.timeit(1000000))
# 100 batches of 10000 each is equivalent to 1000000
print('Batch Cross Product  : %4.6f ' % batchCross.timeit(100))

Original Results

Python Cross Product : 0.754945 
Numpy Cross Product  : 20.752983 
Hybrid Cross Product : 4.467417 

Final Results

Python Cross Product : 0.894334 
Numpy Cross Product  : 21.099040 
Hybrid Cross Product : 4.467194 
Hoist Cross Product  : 20.896225 
Batch Cross Product  : 0.262964 

Needless to say, this wasn't the result I expected. For single vectors, the pure Python version runs roughly 24 to 27 times faster than numpy.cross(). NumPy has beaten the Python equivalent in my other tests, which was the result I expected here too.
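
To put those totals on a per-cross-product basis, here is a small sketch that only divides the Final Results figures by the number of products each test computed. The dictionary and variable names are illustrative and not part of the test script.

```python
# Timings copied from the "Final Results" listing: (total seconds, cross products computed).
results = {
    "python": (0.894334, 1000000),
    "numpy": (21.099040, 1000000),
    "hybrid": (4.467194, 1000000),
    "hoist": (20.896225, 1000000),
    "batch": (0.262964, 100 * 10000),  # 100 calls, 10000 products per call
}

# Average cost of one cross product, in microseconds.
per_product_us = {name: secs / n * 1e6 for name, (secs, n) in results.items()}

# How many times slower one numpy.cross() call is than the pure Python expression.
slowdown = per_product_us["numpy"] / per_product_us["python"]

if __name__ == "__main__":
    for name in sorted(per_product_us):
        print("%-6s : %7.3f us per cross product" % (name, per_product_us[name]))
    print("numpy / python : %.1fx" % slowdown)
```

On these numbers a single numpy.cross() call costs about 21 µs against roughly 0.9 µs for the pure Python expression, while the batched call works out to roughly 0.26 µs per product.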

So, I've got two related questions:

  • Can anyone explain why NumPy is performing so poorly in this case?
  • Is there something I can do to fix it?

Adam Luchjenbroers asked Jan 01 '10 07:01




1 Answer

Try this with larger arrays. With only three elements per vector, I think the fixed cost of calling into NumPy outweighs the handful of list accesses and multiplications the pure Python version needs. If you deal with larger arrays, I think you'll see large wins for NumPy.
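
A minimal sketch of how one might check this, assuming a reasonably recent NumPy and Python 3: it times numpy.cross() with a growing number of vector pairs per call and reports the average cost per cross product. The helper names cross_py and per_product_seconds are hypothetical, and the printed timings will vary from machine to machine.

```python
import timeit

import numpy as np


def cross_py(a, b):
    """Cross product of two 3-element sequences in plain Python."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def per_product_seconds(n_rows, calls=20):
    """Average seconds per cross product when each numpy.cross call gets n_rows pairs."""
    left = np.tile([1.0, 0.0, 0.0], (n_rows, 1))   # shape (n_rows, 3)
    right = np.tile([0.0, 1.0, 0.0], (n_rows, 1))  # shape (n_rows, 3)
    total = timeit.timeit(lambda: np.cross(left, right), number=calls)
    return total / (calls * n_rows)


# Both approaches agree on the result; only the cost per product differs.
batch = np.cross(np.tile([1.0, 0.0, 0.0], (4, 1)), np.tile([0.0, 1.0, 0.0], (4, 1)))
single = cross_py([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])

if __name__ == "__main__":
    for n_rows in (1, 10, 100, 1000, 10000):
        print("%6d rows per call: %.3e s per cross product"
              % (n_rows, per_product_seconds(n_rows)))
```

If the per-call overhead is the culprit, the cost per product should drop sharply as the number of rows per call grows, which is consistent with the Batch Cross Product figure in the question.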

Eli Bendersky answered Sep 25 '22 08:09