I am trying to use Python's itertools
module to speed up a triple nested for loop. The test code below compares a standard triple nested loop with itertools' product method and outputs:
Nested loop time = 2.35023 secs
Itertools loop time = 2.67766 secs
Am I missing something?
import numpy
import itertools
import time
n = 128
a = numpy.arange(n**3).reshape((n,n,n))
b = numpy.zeros((n,n,n))
c = numpy.zeros((n,n,n))
t = time.time()
for i in range(n):
for j in range(n):
for k in range(n):
b[i,j,k] = a[i,j,k]
print 'Nested loop time = %g secs' % (time.time() - t)
t = time.time()
for (i,j,k) in itertools.product(range(n), repeat=3):
c[i,j,k] = a[i,j,k]
print 'Itertools loop time = %g secs' % (time.time() - t)
It does seem like itertools.product
is slower for large values of n
:
In [24]: print _23 from itertools import product def nested_loops(n): for i in range(n): for j in range(n): for k in range(n): pass def itertools_product(n): for (i,j,k) in product(range(n), repeat=3): pass In [25]: %timeit nested_loops(128) 10 loops, best of 3: 68.6 ms per loop In [26]: %timeit itertools_product(128) 10 loops, best of 3: 162 ms per loop In [27]: %timeit nested_loops(10) 10000 loops, best of 3: 84.5 us per loop In [28]: %timeit itertools_product(10) 10000 loops, best of 3: 79.8 us per loop In [30]: %timeit nested_loops(300) 1 loops, best of 3: 833 ms per loop In [31]: %timeit itertools_product(300) 1 loops, best of 3: 2.07 s per loop
And without the tuple unpacking:
In [40]: print _39 from itertools import product def itertools_product(n): for ijk in product(range(n), repeat=3): pass In [41]: %timeit itertools_product(128) 10 loops, best of 3: 115 ms per loop In [42]: %timeit itertools_product(10) 10000 loops, best of 3: 59.2 us per loop In [43]: %timeit itertools_product(300) 1 loops, best of 3: 1.47 s per loop
Also, for fun, list comprehensions and generator expressions:
def list_comprehension_product(n): range_n = range(n) for (i,j,k) in [ (i, j, k) for i in range_n for j in range_n for k in range_n ]: pass def generator_expression_product(n): range_n = range(n) for (i,j,k) in ( (i, j, k) for i in range_n for j in range_n for k in range_n ): pass In [51]: %timeit list_comprehension_product(128) 1 loops, best of 3: 583 ms per loop In [52]: %timeit generator_expression_product(128) 1 loops, best of 3: 480 ms per loop
These benchmarks were run with python --version
:
2.6.7 (r267:88850, Jul 31 2011, 19:30:54) [GCC 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2335.15.00)]
It does seem that the second loop is slower than the first, perhaps because of the tuple unpacking. You don't have to do that, and I find it makes the second loop faster to do it like this:
for ijk in itertools.product(range(n), repeat=3):
c[ijk] = a[ijk]
Of course, with numpy, you want to avoid looping over elements at all, and instead use numpy operations on the entire array at once. That way, all the looping, etc, is being done in C, and you'll get huge speedups.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With