I'm trying to use numpy to element-wise square an array. I've noticed that some of the values appear as negative numbers. The squared value isn't near the max int limit. Does anyone know why this is happening and how I could fix it? I'd rather avoid using a for loop to square an array element-wise, since my data set is quite large.
Here's an example of what is happening:
import numpy as np
test = [1, 2, 47852]
sq = np.array(test)**2
print(sq)
print(47852*47852)
Output:
[1,4, -2005153392]
2289813904
This is because NumPy doesn't check for integer overflow - likely because that would slow down every integer operation, and NumPy is designed with efficiency in mind. So when you have an array of 32-bit integers and your result does not fit in 32 bits, it is still interpreted as 32-bit integer, giving you the strange negative result.
To avoid this, you can be mindful of the dtype
you need to perform the operation safely, in this case 'int64'
would suffice.
>>> np.array(test, dtype='int64')**2
2289813904
You aren't seeing the same issue with Python int
's because Python checks for overflow and adjusts accordingly to a larger data type if necessary. If I recall, there was a question about this on the mailing list and the response was that there would be a large performance implication on atomic array ops if the same were done in NumPy.
As for why your default integer type may be 32-bit on a 64-bit system, as Goyo answered on a related question, the default integer np.int_
type is the same as C long, which is platform dependent but can be 32-bits.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With