I'm looking for a method in Python and/or Numpy vectorization to eliminate use of a for-loop for the following:
for i in list_range_values:
v[list_list_values[i]] += list_comp_values[i]
where:
list_range_values is a Python list of integer values eg. [1, 3, 5], drawn from the range(0, R-1, 1)
list_comp_values is a Python list of numeric values eg. [0.7, 9.8, 1.2, 5, 10, 11.7, 6, 0.2] such that len(list_comp_values) = R
v is a numpy vector of length V such that R can be <, =, > than V
list_list_values is a Python list of lists (each list containing a different number of integer values eg. [[3, 6, 7], [5, 7, 11, 25, 99], [8, 45], [4, 7, 8], [0, 1], [21, 31, 41], [9, 11, 22, 33, 44], [17, 19]]) drawn from the range(0, V-1, 1) and with len(list_list_values) = R
Eg.
for i in list_range_values (= [1, 3, 5]):
i=1: v[[5, 7, 11, 25, 99]] += list_comp_values[1] (= 9.8)
i=3: v[[4, 7, 8]] += list_comp_values[3] (= 5)
i=5: v[[21, 31, 41]] += list_comp_values[5] (= 11.7)
Is there a method available that allows eliminating the for-loop?
Cython, Scipy/Weave/Blitz and a C module are alternative solutions but want to make sure if there is a Numpy vectorization answer first.
Python provides two keywords that terminate a loop iteration prematurely: The Python break statement immediately terminates a loop entirely. Program execution proceeds to the first statement following the loop body. The Python continue statement immediately terminates the current loop iteration.
NumPy is fast because it can do all its calculations without calling back into Python. Since this function involves looping in Python, we lose all the performance benefits of using NumPy. For a 10,000,000-entry NumPy array, this functions takes 2.5 seconds to run on my computer.
The purpose the break statement is to break out of a loop early. For example if the following code asks a use input a integer number x. If x is divisible by 5, the break statement is executed and this causes the exit from the loop.
A for loop is used for iterating over a sequence (that is either a list, a tuple, a dictionary, a set, or a string). This is less like the for keyword in other programming languages, and works more like an iterator method as found in other object-orientated programming languages.
While it often results in a massive speedup to eliminate for loops and take advantage of numpy built-ins/vectorization. I would just point out that it is not always the case. Timing the simple for loop vs the much more involved vectorization, does not give you a large speedup and is much more verbose. Just something to consider:
from timeit import Timer
setstr="""import numpy as np
import itertools
import random
Nlists = 1000
total_lists = 5000
outsz = 100
maxsublistsz = 100
# create random list of lists
list_range_values = random.sample(xrange(total_lists),Nlists)
list_list_values = [random.sample(xrange(outsz),np.random.randint(1,maxsublistsz)) for k in xrange(total_lists)]
list_comp_values = 10*np.random.uniform(size=(total_lists,))
v = np.zeros((outsz,))
def indices(start, end):
lens = end - start
np.cumsum(lens, out=lens)
i = np.ones(lens[-1], dtype=int)
i[0] = start[0]
i[lens[:-1]] += start[1:]
i[lens[:-1]] -= end[:-1]
np.cumsum(i, out=i)
return i
def sum_by_group(values, groups):
order = np.argsort(groups)
groups = groups[order]
values = values[order]
values.cumsum(out=values)
index = np.ones(len(groups), 'bool')
index[:-1] = groups[1:] != groups[:-1]
values = values[index]
groups = groups[index]
values[1:] = np.diff(values)
return values, groups
"""
method1="""
list_list_lens = np.array(map(len, list_list_values))
comp_vals_expanded = np.repeat(list_comp_values, list_list_lens)
list_vals_flat = np.fromiter(itertools.chain.from_iterable(list_list_values),dtype=int)
list_list_starts = np.concatenate(([0], np.cumsum(list_list_lens)[:-1]))
toadd = indices(list_list_starts[list_range_values],(list_list_starts + list_list_lens)[list_range_values])
v[list_vals_flat[toadd]] += comp_vals_expanded[toadd]
"""
method2="""
for k in list_range_values:
v[list_list_values[k]] += list_comp_values[k]
"""
method3="""
llv = [list_list_values[i] for i in list_range_values]
lcv = [list_comp_values[i] for i in list_range_values]
counts = map(len, llv)
indices = np.concatenate(llv)
values = np.repeat(lcv, counts)
totals, indices_unique = sum_by_group(values, indices)
v[indices_unique] += totals
"""
t1 = Timer(method1,setup=setstr).timeit(100)
print t1
t2 = Timer(method2,setup=setstr).timeit(100)
print t2
t3 = Timer(method3,setup=setstr).timeit(100)
print t3
For a pretty large number of elements in the list:
Method1: (no for loop -jterrace) 1.43 seconds
Method2: (for loop) 4.62 seconds
Method3: (no for loop - bago) 2.99 seconds
For a small number of lists (change Nlists
to 10), the for loop is significantly faster than jterrace's solution:
Method1: (no for loop -jterrace) 1.05 seconds
Method2: (for loop) 0.045 seconds
Method3: (no for loop - bago) 0.041 seconds
This is not to knock @jterrace or @bago's solutions, which are quite elegant. Rather it is to point out that often times a simple for loop does not perform that poorly.
Using your example input:
>>> list_list_values = [[3, 6, 7], [5, 7, 11, 25, 99], [8, 45], [4, 7, 8],
[0, 1], [21, 31, 41], [9, 11, 22, 33, 44], [17, 19]]
>>> list_comp_values = [0.7, 9.8, 1.2, 5, 10, 11.7, 6, 0.2]
>>> list_range_values = [1, 3, 5]
First, some generator shenanigans:
>>> indices_weights = ((list_list_values[i], list_comp_values[i])
for i in list_range_values)
>>> flat_indices_weights = ((i, weight) for indices, weight in indices_weights
for i in indices)
Now we pass the data to numpy
. I can't figure out how to produce a rec.array
from an iterator, so I had to convert the above generator to a list. Maybe there's a way to avoid that...
>>> i_w = numpy.rec.array(list(flat_indices_weights),
dtype=[('i', int), ('weight', float)])
>>> numpy.histogram(i_w['i'], bins=range(0, 100), weights=i_w['weight'])
(array([ 0. , 0. , 0. , 0. , 5. , 9.8, 0. , 14.8, 5. ,
0. , 0. , 9.8, 0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 11.7, 0. , 0. , 0. , 9.8, 0. ,
0. , 0. , 0. , 0. , 11.7, 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 11.7, 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 9.8]),
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67,
68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84,
85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99]))
I had a moment to follow up on JoshAdel's tests with a couple of my own. The fastest solution so far uses Bago's set-up but replaces the sum_by_group
function with the built-in histogram
function. Here are the numbers I got (updated):
Method1 (jterrace) : 2.65
Method2 (for loop) : 2.25
Method3 (Bago) : 1.14
Method4 (histogram) : 2.82
Method5 (3/4 combo) : 1.07
Note that as implemented here, the first method gives incorrect results according to my test. I didn't have the time to figure out what the problem is. The code for my test is below; it only gently adjusts JoshAdel's original code, but I post it here in full for convenience. (Updated to include Bago's comments and somewhat de-kludged.)
from timeit import Timer
setstr="""import numpy as np
import itertools
import random
Nlists = 1000
total_lists = 5000
outsz = 100
maxsublistsz = 100
# create random list of lists
list_range_values = random.sample(xrange(total_lists),Nlists)
list_list_values = [random.sample(xrange(outsz),np.random.randint(1,maxsublistsz)) for k in xrange(total_lists)]
list_comp_values = list(10*np.random.uniform(size=(total_lists,)))
v = np.zeros((outsz,))
def indices(start, end):
lens = end - start
np.cumsum(lens, out=lens)
i = np.ones(lens[-1], dtype=int)
i[0] = start[0]
i[lens[:-1]] += start[1:]
i[lens[:-1]] -= end[:-1]
np.cumsum(i, out=i)
return i
def sum_by_group(values, groups):
order = np.argsort(groups)
groups = groups[order]
values = values[order]
values.cumsum(out=values)
index = np.ones(len(groups), 'bool')
index[:-1] = groups[1:] != groups[:-1]
values = values[index]
groups = groups[index]
values[1:] = np.diff(values)
return values, groups
"""
setstr_test = setstr + "\nprint_v = True\n"
method1="""
list_list_lens = np.array(map(len, list_list_values))
comp_vals_expanded = np.repeat(list_comp_values, list_list_lens)
list_vals_flat = np.fromiter(itertools.chain.from_iterable(list_list_values),dtype=int)
list_list_starts = np.concatenate(([0], np.cumsum(list_list_lens)[:-1]))
toadd = indices(list_list_starts[list_range_values],(list_list_starts + list_list_lens)[list_range_values])
v[list_vals_flat[toadd]] += comp_vals_expanded[toadd]
"""
method2="""
for k in list_range_values:
v[list_list_values[k]] += list_comp_values[k]
"""
method3="""
llv = [np.fromiter(list_list_values[i], 'int') for i in list_range_values]
lcv = [list_comp_values[i] for i in list_range_values]
counts = map(len, llv)
indices = np.concatenate(llv)
values = np.repeat(lcv, counts)
totals, indices_unique = sum_by_group(values, indices)
v[indices_unique] += totals
"""
method4="""
indices_weights = ((list_list_values[i], list_comp_values[i]) for i in list_range_values)
flat_indices_weights = ((i, weight) for indices, weight in indices_weights for i in indices)
i_w = np.rec.array(list(flat_indices_weights), dtype=[('i', 'i'), ('weight', 'd')])
v += np.histogram(i_w['i'], bins=range(0, outsz + 1), weights=i_w['weight'], new=True)[0]
"""
method5="""
llv = [np.fromiter(list_list_values[i], 'int') for i in list_range_values]
lcv = [list_comp_values[i] for i in list_range_values]
counts = map(len, llv)
indices = np.concatenate(llv)
values = np.repeat(lcv, counts)
v += np.histogram(indices, bins=range(0, outsz + 1), weights=values, new=True)[0]
"""
t1 = Timer(method1,setup=setstr).timeit(100)
print t1
t2 = Timer(method2,setup=setstr).timeit(100)
print t2
t3 = Timer(method3,setup=setstr).timeit(100)
print t3
t4 = Timer(method4,setup=setstr).timeit(100)
print t4
t5 = Timer(method5,setup=setstr).timeit(100)
print t5
exec(setstr_test + method1 + "\nprint v\n")
exec("\nv = np.zeros((outsz,))\n" + method2 + "\nprint v\n")
exec("\nv = np.zeros((outsz,))\n" + method3 + "\nprint v\n")
exec("\nv = np.zeros((outsz,))\n" + method4 + "\nprint v\n")
exec("\nv = np.zeros((outsz,))\n" + method5 + "\nprint v\n")
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With