
Why is numpy slower than python? How to make code perform better

I rewrote my neural net from pure Python to NumPy, but now it runs even slower. So I tried these two functions:

import numpy as np

def d():
    a = [1,2,3,4,5]
    b = [10,20,30,40,50]
    c = [i*j for i,j in zip(a,b)]
    return c

def e():
    a = np.array([1,2,3,4,5])
    b = np.array([10,20,30,40,50])
    c = a*b
    return c

timeit d = 1.77135205057

timeit e = 17.2464673758

NumPy is about 10 times slower. Why is that, and how do I use NumPy properly?

asked May 16 '13 by user2173836

People also ask

How can I make my NumPy code faster?

Cython can dramatically speed up NumPy array processing: by explicitly declaring the "ndarray" data type and the types of Python variables, Cython generates compiled code that runs much faster at runtime (one tutorial reports up to 1250x speedups).

Is NumPy faster than Python?

NumPy arrays are faster than Python lists for the following reasons: an array is a collection of homogeneous data types stored in contiguous memory locations, whereas a Python list is a collection of heterogeneous data types stored in non-contiguous memory locations.
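As a quick illustration of that layout difference (a sketch added for this writeup, not part of the original snippet):

```python
import numpy as np

# A NumPy array stores raw values of a single dtype in one contiguous buffer.
a = np.array([1, 2, 3, 4, 5])
print(a.dtype)                   # one machine type for every element
print(a.itemsize)                # bytes per element, fixed across the array
print(a.flags['C_CONTIGUOUS'])   # True: a single contiguous memory block

# A Python list instead stores pointers to separately allocated int objects.
b = [1, 2, 3, 4, 5]
print(all(isinstance(x, int) for x in b))
```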

What is NumPy and how is it better than a list in Python?

NumPy is the fundamental package for scientific computing in Python. NumPy arrays facilitate advanced mathematical and other operations on large amounts of data. Typically, such operations are executed more efficiently and with less code than is possible using Python's built-in sequences.
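For example, a whole-array expression replaces an explicit Python loop and runs in NumPy's compiled inner loops (a sketch added here for illustration):

```python
import numpy as np

x = np.arange(1_000_000, dtype=np.float64)

# One vectorized line, evaluated element-wise in compiled C code:
y = 3.0 * x ** 2 + 2.0 * x + 1.0

# The pure-Python equivalent needs an explicit loop or comprehension:
y_loop = [3.0 * v ** 2 + 2.0 * v + 1.0 for v in x[:5]]
print(np.allclose(y[:5], y_loop))  # True: same results, far less code per element
```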

Why is NumPy more advantageous than a Python list?

NumPy uses much less memory to store data: NumPy arrays take significantly less memory than Python lists. NumPy also provides a mechanism for specifying the data type of the contents, which allows further optimisation of the code.
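You can verify the memory claim directly (a rough sketch; exact numbers vary by platform and Python version):

```python
import sys
import numpy as np

n = 10_000
lst = list(range(n))
arr = np.arange(n)

# List: the container's pointer table plus a separate int object per element.
list_bytes = sys.getsizeof(lst) + sum(sys.getsizeof(x) for x in lst)
# Array: one packed buffer of n machine integers.
array_bytes = arr.nbytes

print(list_bytes, array_bytes)
print(list_bytes > array_bytes)  # the list costs several times more memory
```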


1 Answer

I would assume that the discrepancy is because you're constructing both lists and arrays in e, whereas you're only constructing lists in d. Consider:

import numpy as np

def d():
    a = [1,2,3,4,5]
    b = [10,20,30,40,50]
    c = [i*j for i,j in zip(a,b)]
    return c

def e():
    a = np.array([1,2,3,4,5])
    b = np.array([10,20,30,40,50])
    c = a*b
    return c

#Warning:  Functions with mutable default arguments are below.
# This code is only for testing and would be bad practice in production!
def f(a=[1,2,3,4,5],b=[10,20,30,40,50]):
    c = [i*j for i,j in zip(a,b)]
    return c

def g(a=np.array([1,2,3,4,5]),b=np.array([10,20,30,40,50])):
    c = a*b
    return c


import timeit
print(timeit.timeit('d()', 'from __main__ import d'))
print(timeit.timeit('e()', 'from __main__ import e'))
print(timeit.timeit('f()', 'from __main__ import f'))
print(timeit.timeit('g()', 'from __main__ import g'))

Here the functions f and g avoid recreating the lists/arrays on every call, and we get very similar performance:

1.53083586693
15.8963699341
1.33564996719
1.69556999207
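To confirm that construction is the culprit, you can time just the array creation from the question's e in isolation (a quick check added here, not part of the original answer):

```python
import timeit

# Building a small ndarray from a list literal, as e() does on every call:
build_array = timeit.timeit('np.array([1,2,3,4,5])',
                            'import numpy as np', number=100_000)
# Building the equivalent plain list literal, as d() does:
build_list = timeit.timeit('[1,2,3,4,5]', number=100_000)

print(build_array > build_list)  # ndarray construction dominates at this size
```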

Note that list-comp + zip still wins. However, if we make the arrays sufficiently big, numpy wins hands down:

t1 = [1,2,3,4,5] * 100
t2 = [10,20,30,40,50] * 100
t3 = np.array(t1)
t4 = np.array(t2)
print(timeit.timeit('f(t1,t2)', 'from __main__ import f,t1,t2', number=10000))
print(timeit.timeit('g(t3,t4)', 'from __main__ import g,t3,t4', number=10000))

My results are:

0.602419137955
0.0263929367065
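As a rough illustration of the crossover (timings vary by machine; this loop is an addition to the original answer), you can watch the list/NumPy ratio grow with array size:

```python
import timeit
import numpy as np

for n in (5, 500, 50_000):
    lst = list(range(n))
    arr = np.arange(n)
    t_list = timeit.timeit(lambda: [i * i for i in lst], number=200)
    t_numpy = timeit.timeit(lambda: arr * arr, number=200)
    # Per-call NumPy overhead is fixed, so the ratio improves as n grows.
    print(n, round(t_list / t_numpy, 1))
```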
answered Jan 04 '23 by mgilson