My problem is that
np.array([2**31], dtype=np.uint32) >> 32
does not return 0, but returns array([2147483648], dtype=uint32) instead. The same is true for
np.right_shift(np.array([2**31], dtype=np.uint32), 32)
(so I believe this is simply how >> is implemented).
Interestingly, all of these alternatives seem to work as expected, returning some kind of 0:
import numpy as np

print(
    2**31 >> 32,
    np.uint32(2**31) >> 32,
    np.array(2**31, dtype=np.uint32) >> 32,
    np.right_shift(2**31, 32),
    np.right_shift([2**31], 32),
    np.right_shift(np.uint32(2**31), 32),
    np.right_shift(np.array(2**31, dtype=np.uint32), 32),
)
In particular, what is the difference between NumPy arrays representing 2147483648 and [2147483648]?
I have seen this issue in JavaScript (Why does << 32 not result in 0 in javascript?) and C++ (Weird behavior of right shift operator (1 >> 32), Why is `int >> 32` not always zero?), but not yet in Python/NumPy. In fact, neither the Python nor the NumPy docs seem to document this behavior:
https://docs.python.org/3/reference/expressions.html#shifting-operations
https://docs.scipy.org/doc/numpy/reference/generated/numpy.right_shift.html
The unsigned right shift operator ( >>> ) (zero-fill right shift) evaluates the left-hand operand as an unsigned number, and shifts the binary representation of that number by the number of bits, modulo 32, specified by the right-hand operand.
The left shift operator ( << ) shifts the first operand the specified number of bits, modulo 32, to the left. Excess bits shifted off to the left are discarded. Zero bits are shifted in from the right.
Left Shifts: The left-shift operator causes the bits in shift-expression to be shifted to the left by the number of positions specified by additive-expression. The bit positions that have been vacated by the shift operation are zero-filled.
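While not documented, NumPy is mostly implemented in C, and in C (and C++) shifting by a count greater than or equal to the bit width of the operand is undefined behavior, so the result can be arbitrary. On x86, for example, the shift instruction only uses the low 5 bits of the count for 32-bit operands, so a shift by 32 acts like a shift by 0, which would explain why the value comes back unchanged.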
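One plausible way the unchanged value can arise is a shift count that gets reduced modulo the bit width, as in the JavaScript quotes above. The sketch below emulates that in plain Python; the function name masked_right_shift_u32 is hypothetical and only illustrates the arithmetic, it is not how NumPy is implemented.

```python
def masked_right_shift_u32(value, shift):
    """Right-shift a 32-bit unsigned value, using only the low 5 bits of the count.

    This models hardware or languages that reduce the shift count modulo 32
    (an assumption for illustration, not NumPy's actual code path).
    """
    value &= 0xFFFFFFFF          # keep only the low 32 bits
    return value >> (shift & 31)  # a count of 32 becomes 0, 33 becomes 1, ...


print(masked_right_shift_u32(2**31, 32))  # 2147483648, the value is unchanged
print(masked_right_shift_u32(2**31, 33))  # 1073741824, same as shifting by 1
print(2**31 >> 32)                        # 0, Python ints have no fixed width
```

With the count masked, a shift by 32 is a shift by 0, which matches the value reported in the question.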
If you look at the types of the examples that work you'll see why they work:
print(
type(2**31 >> 32),
type(np.uint32(2**31) >> 32),
type(np.array(2**31, dtype=np.uint32) >> 32),
type(np.right_shift(2**31, 32)),
np.right_shift([2**31], 32).dtype,
type(np.right_shift(np.uint32(2**31), 32)),
type(np.right_shift(np.array(2**31, dtype=np.uint32), 32)),
)
<class 'int'> <class 'numpy.int64'> <class 'numpy.int64'> <class 'numpy.int64'> int64 <class 'numpy.int64'> <class 'numpy.int64'>
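The first uses Python's own int type, while the others are all converted to numpy.int64, where a shift by 32 is well defined because it is smaller than the 64-bit width.
For the scalar and zero-dimensional cases, this is mostly because scalars and zero-dimensional arrays follow different type-promotion rules than arrays with one or more dimensions. In the list case, it is because the list is converted to an array with NumPy's default integer type, which is not numpy.uint32.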
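The list case can be checked in isolation. This is a minimal sketch; it relies only on the fact that 2**31 does not fit in a signed 32-bit integer, so NumPy picks a 64-bit integer type for the list.

```python
import numpy as np

# A plain Python list carries no dtype, so NumPy has to choose one.
# 2**31 is too large for int32, so the resulting array is int64.
from_list = np.asarray([2**31])
print(from_list.dtype)   # int64

# Shifting a 64-bit value by 32 is within the type's width, so it is well defined.
shifted = from_list >> 32
print(shifted, shifted.dtype)
```

Because the shift count is smaller than the 64-bit width here, the result is a genuine 0 rather than an artifact of undefined behavior.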
On the other hand
print((np.array([2**31], dtype=np.uint32) >> 32).dtype)
uint32
Here the result stays uint32, so the shift count of 32 equals the bit width of the type and you run into the undefined behavior.
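If you need a result that does not depend on this, two simple workarounds are to widen the array before shifting or to handle oversized shift counts explicitly. The following is a sketch under the assumption that the input is an unsigned integer array and the shift is a non-negative Python int; safe_right_shift is a hypothetical helper, not a NumPy function.

```python
import numpy as np


def safe_right_shift(values, shift):
    """Right-shift an unsigned integer array, returning zeros for oversized shifts.

    Assumes an unsigned dtype and a non-negative integer shift (hypothetical helper).
    """
    values = np.asarray(values)
    bits = values.dtype.itemsize * 8   # bit width of the element type
    if shift >= bits:
        # Every bit is shifted out, so the mathematically expected result is 0.
        return np.zeros_like(values)
    return values >> shift


x = np.array([2**31], dtype=np.uint32)

# Option 1: guard the shift count explicitly; the dtype stays uint32.
guarded = safe_right_shift(x, 32)
print(guarded, guarded.dtype)

# Option 2: widen to 64 bits first, so a shift by 32 is within the type's width.
widened = x.astype(np.uint64) >> 32
print(widened, widened.dtype)
```

The guarded version keeps the original dtype, while widening is shorter to write but doubles the memory used by the temporary array.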