Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Optimizing Python for loop

I have a loop that is my biggest time suck for a particular function and I'd like to speed it up. Current, this single loop takes up about 400ms, while the execution for the rest of the function takes about 610ms.

The code is:

for ctr in xrange(N):
    list1[ctr] = in1[ctr] - in1[0] - ctr * c1
    list2[ctr] = in2[ctr] - in2[0] - ctr * c2
    list3[ctr] = c3 - in1[ctr]
    list4[ctr] = c4 - in2[ctr]

N can be anywhere from around 40,000 to 120,000, and is the length of all lists (in1, in2, listN) shown.

Does anyone know some Python tricks to speed this up? I already tried using map instead, as I know it tries to compile to more efficient code, but it was about 250ms slower.

Thanks

like image 750
bheklilr Avatar asked Dec 13 '22 10:12

bheklilr


2 Answers

Assuming that list1, list2, etc, all are numerical, consider using numpy arrays instead of lists. For large sequences of integers or floats you'll see a huge speedup.

If you go that route, your loop above could be written like this:

ctr = np.arange(N)
list1 = n1 - n1[0] - ctr * c1
list2 = n2 - n2[0] - ctr * c2
list3 = c3 - ctr
list4 = c4 - ctr

And as a full stand-alone example for timing:

import numpy as np
N = 100000

# Generate some random data...
n1 = np.random.random(N)
n2 = np.random.random(N)
c1, c2, c3, c4 = np.random.random(4)

ctr = np.arange(N)
list1 = n1 - n1[0] - ctr * c1
list2 = n2 - n2[0] - ctr * c2
list3 = c3 - ctr
list4 = c4 - ctr

Of course, if your list1, list2, etc are non-numerical (i.e. lists of python objects other than floats or ints), then this won't help.

like image 184
Joe Kington Avatar answered Jan 02 '23 03:01

Joe Kington


There was a bit of a mistake originally (see below) These are more properly cached.

# These can be cached as they do not change.
base_in1 = in1[0]
base_in2 = in2[0]
for ctr in xrange(N):
    # these are being looked up several times. Look-ups take time in almost every
    # language. Look them up once and then use the new value.
    cin1 = in1[ctr]
    cin2 = in2[ctr]
    list1[ctr] = cin1 - base_in1 - ctr * c1
    list2[ctr] = cin2 - base_in2 - ctr * c2
    list3[ctr] = c3 - cin1
    list4[ctr] = c4 - cin2

(Mistake below):

Originally I thought that this could be solved by caching constants:

# these values never change
ctr1 = ctr * c1
ctr2 = ctr * c2
in10 = ctr1 + in1[0]
in20 = ctr2 + in2[0]
for ctr in xrange(N):
    # these are being looked up several times. That costs time.
    # look them up once and then use the new value.
    cin1 = in1[ctr]
    cin2 = in2[ctr]
    list1[ctr] = cin1 - in10
    list2[ctr] = cin2 - in20
    list3[ctr] = c3 - cin1
    list4[ctr] = c4 - cin2

But as Tim pointed out, I had missed the ctr in my original attempt.

like image 40
cwallenpoole Avatar answered Jan 02 '23 03:01

cwallenpoole