I have the following two (supposedly equivalent) functions, and I want to see which one executes faster (they will be used to process a large data set):
import numpy as np


def interval_energy(array, start_intensity, intensity_window_length):
    bins = np.bincount(array.ravel())
    energy = 0
    for i in range(start_intensity, min(start_intensity + intensity_window_length, len(bins))):
        energy += bins[i] * (i ** 2)
    print("Energy: {}".format(energy))
    return energy


def interval_energy2(array, start_intensity, intensity_window_length):
    flat_array = array.ravel()
    energy = 0
    for i in range(0, array.size):
        if start_intensity <= flat_array[i] < (start_intensity + intensity_window_length):
            energy += flat_array[i] ** 2
    print("Energy2: {}".format(energy))
    return energy
I'm using the following code to time the two functions:
if __name__ == '__main__':
    import timeit
    setup = """
from interval_energy import interval_energy, interval_energy2
import numpy as np
a = np.random.randint(0, 3000, (150, 150, 150))
"""
    t = timeit.Timer('interval_energy(a, 50, 2450)', setup)
    t2 = timeit.Timer('interval_energy2(a, 50, 2450)', setup)
    t3 = timeit.Timer("""
interval_energy(a, 50, 2450)
interval_energy2(a, 50, 2450)
""", setup)
    print(t.timeit(5))
    print(t2.timeit(5))
    print(t3.timeit(5))
In interval_energy2, however, the energy variable overflows and this warning is raised:
RuntimeWarning: overflow encountered in long_scalars
Update 1: I have noticed that in the first version, energy is of type int when it's created and int64 when it's returned, whereas in the second version it is also of type int when it's created but becomes int32 and stays int32 until it is returned, hence the overflow. Why does Python automatically convert the variable in one case but not in the other?
Update 2: It's been established that the two functions, in theory, produce the same result.
Update 3: I'm using Python 3.5.2, 64-bit. I have read that Python 3 ONLY uses long, so what I see here (32-bit integer overflow) should not even be possible? Maybe it is possible because of the C layer underneath pandas/NumPy.
Update 4: Possibly a bug in CPython for Windows, as the identical code works fine on OS X / Unix (same Python and NumPy versions used on both systems).
Found it. This is a good question:
print type(flat_array[3])
<type 'numpy.int32'>
but, after the bincount:
print type(bins[3])
<type 'numpy.int64'>
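Apparently the binning converted the data type without you noticing! In the first function energy is accumulated from the int64 values in bins, while in the second it is accumulated from the int32 values in flat_array. This is why the fix by f5r5e5d worked. So you should have got the overflow in both functions, but the first was spared. Change your array definition:
a = np.random.randint(0, 3000, (150, 150, 150), dtype=np.int64)
as f5r5e5d suggested. With that change I get no warning, and close, but not identical, results. What to make of that is up to you.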
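The dtype difference described above can be checked in isolation. This is a minimal sketch, not the original poster's code: the tiny `values` array and the names `big`, `wrapped` and `widened` are made up for illustration, and the input is forced to int32 so the behaviour does not depend on the platform default.

```python
import warnings

import numpy as np

# Hypothetical small input, forced to int32 to mimic the Windows default.
values = np.array([2000, 2000, 2000], dtype=np.int32)
bins = np.bincount(values)

# The elements keep int32, but the counts come back as the platform
# index type (np.intp), which is 64 bits wide on a 64-bit build.
print(values.dtype, bins.dtype)

# Adding two int32 scalars stays int32, so the result cannot hold 2**31.
big = np.int32(2**31 - 1)
with warnings.catch_warnings():
    warnings.simplefilter("ignore", RuntimeWarning)  # silence the overflow warning
    wrapped = big + np.int32(1)
print(type(wrapped))

# Widening one operand to int64 first gives the expected value.
widened = np.int64(big) + np.int32(1)
print(type(widened), widened)
```

The same promotion happens inside the two functions: once energy has been added to an int32 element it is itself an int32 scalar, whereas multiplying by a value taken from bins keeps it at 64 bits.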
EDIT
Currently it seems that on versions after 2.7.9, where dtype is an allowed keyword of array, the default dtype is chosen according to the values given to the array. Using energy = np.int64() makes sure the variable we expect to overflow is a large int.
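As a rough illustration of the np.int64 accumulator idea, here is one way the second function could be written so that it works even when the input array is int32. The function name `interval_energy2_int64` and the local `upper` are hypothetical, and each element is widened before squaring as an extra precaution that the original answer does not mention.

```python
import numpy as np


def interval_energy2_int64(array, start_intensity, intensity_window_length):
    """Sum of squares of the values inside the window, accumulated in int64."""
    flat_array = array.ravel()
    energy = np.int64(0)  # 64-bit accumulator instead of a plain 0
    upper = start_intensity + intensity_window_length
    for value in flat_array:
        if start_intensity <= value < upper:
            # Widen the element first so neither the square nor the
            # running sum is computed in int32.
            energy += np.int64(value) ** 2
    return energy


# 1000 values of 2000 give 4_000_000_000, which does not fit in int32.
demo = np.full(1000, 2000, dtype=np.int32)
print(interval_energy2_int64(demo, 50, 2450))
```

The loop is still slow for large arrays; the sketch only addresses the overflow, not the timing question.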
I assume you wanted flat_array.size in the interval_energy2 for-loop range? I changed the interval test to use "and", and changed the data type to dtype='int64' in my cut-down version of your setup:
import numpy as np


def interval_energy(array, start_intensity, intensity_window_length):
    bins = np.bincount(array.ravel())
    energy = 0
    for i in range(start_intensity, min(start_intensity + intensity_window_length, len(bins))):
        energy += bins[i] * (i ** 2)
    print("Energy: {}".format(energy))
    return energy


def interval_energy2(array, start_intensity, intensity_window_length):
    flat_array = array.ravel()
    energy = 0
    for i in range(0, flat_array.size):
        if start_intensity <= flat_array[i] and flat_array[i] < (start_intensity + intensity_window_length):
            energy += flat_array[i] ** 2
    print("Energy2: {}".format(energy))
    return energy


import numpy as np
a = np.random.randint(0, 3000, (150, 150, 150), dtype='int64')
interval_energy(a, 50, 2450)
interval_energy2(a, 50, 2450)
In Spyder I get:
In [53]:
import numpy as np
a = np.random.randint(0, 3000, (150, 150, 150), dtype='int64')
interval_energy(a, 50, 2450)
interval_energy2(a, 50, 2450)
Energy: 5859327673866
Energy2: 5859327673866
Out[53]: 5859327673866
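To cross-check that both loops agree on other inputs, the same quantity can also be computed with a boolean mask. This is a swapped-in technique, not something either function above does; the name `interval_energy_masked` is hypothetical, and the array is cast to int64 up front so the result does not depend on the input dtype.

```python
import numpy as np


def interval_energy_masked(array, start_intensity, intensity_window_length):
    """Sum of squares of all values v with start <= v < start + window."""
    flat = array.ravel().astype(np.int64)  # widen before any arithmetic
    upper = start_intensity + intensity_window_length
    mask = (flat >= start_intensity) & (flat < upper)
    # Convert to a Python int so the caller gets an arbitrary-precision value.
    return int(np.sum(flat[mask] ** 2))


# Works the same whether the input is int32 or int64.
check = np.full(1000, 2000, dtype=np.int32)
print(interval_energy_masked(check, 50, 2450))
```

Because it avoids the Python-level loop, this form may also be worth including in the timeit comparison.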