Optimized dot product in Python

Tags:

The dot product of two n-dimensional vectors u=[u1,u2,...un] and v=[v1,v2,...,vn] is is given by u1*v1 + u2*v2 + ... + un*vn.

A question posted yesterday encouraged me to find the fastest way to compute dot products in Python using only the standard library, no third-party modules or C/Fortran/C++ calls.

I timed four different approaches; so far the fastest seems to be sum(starmap(mul,izip(v1,v2))) (where starmap and izip come from the itertools module).

For the code presented below, these are the elapsed times (in seconds, for one million runs):

d0: 12.01215
d1: 11.76151
d2: 12.54092
d3: 09.58523

Can you think of a faster way to do this?

import timeit # module with timing subroutines                                                              
import random # module to generate random numnbers                                                          
from itertools import imap,starmap,izip
from operator import mul

def v(N=50,min=-10,max=10):
    """Generates a random vector (in an array) of dimension N; the                                          
    values are integers in the range [min,max]."""
    out = []
    for k in range(N):
        out.append(random.randint(min,max))
    return out

def check(v1,v2):
    if len(v1)!=len(v2):
        raise ValueError,"the lenght of both arrays must be the same"
    pass

def d0(v1,v2):
    """                                                                                                     
    d0 is Nominal approach:                                                                                 
    multiply/add in a loop                                                                                  
    """
    check(v1,v2)
    out = 0
    for k in range(len(v1)):
        out += v1[k] * v2[k]
    return out

def d1(v1,v2):
    """                                                                                                     
    d1 uses an imap (from itertools)                                                                        
    """
    check(v1,v2)
    return sum(imap(mul,v1,v2))

def d2(v1,v2):
    """                                                                                                     
    d2 uses a conventional map                                                                              
    """
    check(v1,v2)
    return sum(map(mul,v1,v2))

def d3(v1,v2):
    """                                                                                                     
    d3 uses a starmap (itertools) to apply the mul operator on an izipped (v1,v2)                           
    """
    check(v1,v2)
    return sum(starmap(mul,izip(v1,v2)))

# generate the test vectors                                                                                 
v1 = v()
v2 = v()

if __name__ == '__main__':

    # Generate two test vectors of dimension N                                                              

    t0 = timeit.Timer("d0(v1,v2)","from dot_product import d0,v1,v2")
    t1 = timeit.Timer("d1(v1,v2)","from dot_product import d1,v1,v2")
    t2 = timeit.Timer("d2(v1,v2)","from dot_product import d2,v1,v2")
    t3 = timeit.Timer("d3(v1,v2)","from dot_product import d3,v1,v2")

    print "d0 elapsed: ", t0.timeit()
    print "d1 elapsed: ", t1.timeit()
    print "d2 elapsed: ", t2.timeit()
    print "d3 elapsed: ", t3.timeit()

Notice that the name of the file must be dot_product.py for the script to run; I used Python 2.5.1 on a Mac OS X Version 10.5.8.

EDIT:

I ran the script for N=1000 and these are the results (in seconds, for one million runs):

d0: 205.35457
d1: 208.13006
d2: 230.07463
d3: 155.29670

I guess it is safe to assume that, indeed, option three is the fastest and option two the slowest (of the four presented).

432

asked Dec 01 '09 19:12

Escualo

1 Answers

Just for fun I wrote a "d4" which uses numpy:

from numpy import dot
def d4(v1, v2): 
    check(v1, v2)
    return dot(v1, v2)

My results (Python 2.5.1, XP Pro sp3, 2GHz Core2 Duo T7200):

d0 elapsed:  12.1977242918
d1 elapsed:  13.885232341
d2 elapsed:  13.7929552499
d3 elapsed:  11.0952246724

~~d4 elapsed: 56.3278584289 # go numpy!~~

And, for even more fun, I turned on psyco:

d0 elapsed:  0.965477735299
d1 elapsed:  12.5354792299
d2 elapsed:  12.9748163524
d3 elapsed:  9.78255448667

~~d4 elapsed: 54.4599059378~~

Based on that, I declare d0 the winner :)

Update

@kaiser.se: I probably should have mentioned that I did convert everything to numpy arrays first:

from numpy import array
v3 = [array(vec) for vec in v1]
v4 = [array(vec) for vec in v2]

# then
t4 = timeit.Timer("d4(v3,v4)","from dot_product import d4,v3,v4")

And I included check(v1, v2) since it's included in the other tests. Leaving it off would give numpy an unfair advantage (though it looks like it could use one). The array conversion shaved off about a second (much less than I thought it would).

All of my tests were run with N=50.

@nikow: I'm using numpy 1.0.4, which is undoubtedly old, it's certainly possible that they've improved performance over the last year and a half since I've installed it.

Update #2

@kaiser.se Wow, you are totally right. I must have been thinking that these were lists of lists or something (I really have no idea what I was thinking ... +1 for pair programming).

How does this look:

v3 = array(v1)
v4 = array(v2)

New results:

d4 elapsed:  3.22535741274

With Psyco:

d4 elapsed:  2.09182619579

d0 still wins with Psyco, but numpy is probably better overall, especially with larger data sets.

Yesterday I was a bit bothered my slow numpy result, since presumably numpy is used for a lot of computation and has had a lot of optimization. Obviously though, not bothered enough to check my result :)

170

answered Oct 13 '22 12:10

Seth

Related questions
                            
                                Create clusters using correlation matrix in Python
                            
                                codecov unable to collect data using pytest - "Coverage.py warning: No data was collected."
                            
                                Flask send_file not sending file
                            
                                How to downgrade Python from 3.7 to 3.5 in Anaconda [closed]
                            
                                pd.read_csv add column named "Unnamed: 0
                            
                                Saving result of DataFrame show() to string in pyspark
                            
                                deadlock detected when trying to start server
                            
                                Python Logic of ListNode in Leetcode
                            
                                Applying callbacks in a custom training loop in Tensorflow 2.0
                            
                                Can we plot image data in Altair?
                            
                                How can I add python type annotations to the flask global context g?
                            
                                import dataset into google colab from another drive account
                            
                                Unable to find custom Blender operator in F3 operator search (Blender 2.9)
                            
                                TypeError: Object of type function is not JSON serializable when using flask_jwt_extended int RESTful API
                            
                                pyPdf for IndirectObject extraction
                            
                                Google AppEngine: how to count a database's entries beyond 1000?
                            
                                How do I disassemble a Python script?
                            
                                mrdivide function in MATLAB: what is it doing, and how can I do it in Python?
                            
                                How do I install PyGTK / PyGobject on Windows with Python 2.6?
                            
                                Sharing utilities modules across python projects

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Optimized dot product in Python

Tags:

python

algorithm

math

Escualo

People also ask

1 Answers

Seth

Recent Activity

Donate For Us