Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pure Python faster than Numpy? can I make this numpy code faster?

Tags:

python

numpy

I need to compute the min, max, and mean from a specific list of faces/vertices. I tried to optimize this computing with the use of Numpy but without success.

Here is my test case:

#!/usr/bin/python
# -*- coding: iso-8859-15 -*-
'''
Module Started 22 févr. 2013
@note: test case comparaison numpy vs python
@author: Python4D/damien
'''

import numpy as np
import time


def Fnumpy(vertices):
  np_vertices=np.array(vertices)
  _x=np_vertices[:,:,0]
  _y=np_vertices[:,:,1]
  _z=np_vertices[:,:,2]
  _min=[np.min(_x),np.min(_y),np.min(_z)]
  _max=[np.max(_x),np.max(_y),np.max(_z)]
  _mean=[np.mean(_x),np.mean(_y),np.mean(_z)]
  return _mean,_max,_min

def Fpython(vertices):
  list_x=[item[0] for sublist in vertices for item in sublist]
  list_y=[item[1] for sublist in vertices for item in sublist]
  list_z=[item[2] for sublist in vertices for item in sublist]
  taille=len(list_x)
  _mean=[sum(list_x)/taille,sum(list_y)/taille,sum(list_z)/taille]
  _max=[max(list_x),max(list_y),max(list_z)]
  _min=[min(list_x),min(list_y),min(list_z)]    
  return _mean,_max,_min

if __name__=="__main__":
  vertices=[[[1.1,2.2,3.3,4.4]]*4]*1000000
  _t=time.clock()
  print ">>NUMPY >>{} for {}s.".format(Fnumpy(vertices),time.clock()-_t)
  _t=time.clock()
  print ">>PYTHON>>{} for {}s.".format(Fpython(vertices),time.clock()-_t)

The results are:

Numpy:

([1.1000000000452519, 2.2000000000905038, 3.3000000001880174], [1.1000000000000001, 2.2000000000000002, 3.2999999999999998], [1.1000000000000001, 2.2000000000000002, 3.2999999999999998]) for 27.327068618s.

Python:

([1.100000000045252, 2.200000000090504, 3.3000000001880174], [1.1, 2.2, 3.3], [1.1, 2.2, 3.3]) for 1.81366938593s.

Pure Python is 15x faster than Numpy!

like image 317
baco Avatar asked Dec 16 '22 13:12

baco


1 Answers

The reason your Fnumpy is slower is that it contains an additional step not done by Fpython: the creation of a numpy array in memory. If you move the line np_verticies=np.array(verticies) outside of Fnumpy and the timed section your results will be very different:

>>NUMPY >>([1.1000000000452519, 2.2000000000905038, 3.3000000001880174], [1.1000000000000001, 2.2000000000000002, 3.2999999999999998], [1.1000000000000001, 2.2000000000000002, 3.2999999999999998]) for 0.500802s.
>>PYTHON>>([1.100000000045252, 2.200000000090504, 3.3000000001880174], [1.1, 2.2, 3.3], [1.1, 2.2, 3.3]) for 2.182239s.

You can also speed up the allocation step significantly by providing a datatype hint to numpy when you create it. If you tell Numpy you have an array of floats, then even if you leave the np.array() call in the timing loop it will beat the pure python version.

If I change np_vertices=np.array(vertices) to np_vertices=np.array(vertices, dtype=np.float_) and keep it in Fnumpy, the Fnumpy version will beat Fpython even though it has to do a lot more work:

>>NUMPY >>([1.1000000000452519, 2.2000000000905038, 3.3000000001880174], [1.1000000000000001, 2.2000000000000002, 3.2999999999999998], [1.1000000000000001, 2.2000000000000002, 3.2999999999999998]) for 1.586066s.
>>PYTHON>>([1.100000000045252, 2.200000000090504, 3.3000000001880174], [1.1, 2.2, 3.3], [1.1, 2.2, 3.3]) for 2.196787s.
like image 67
Francis Avila Avatar answered Dec 21 '22 11:12

Francis Avila