
My problem is that

np.array([2**31], dtype=np.uint32) >> 32

does not return 0, but returns array([2147483648], dtype=uint32) instead. The same is true for

np.right_shift(np.array([2**31], dtype=np.uint32), 32)

(so I believe this is simply how >> is implemented).

Interestingly, all these alternatives seem to work as expected, returning some kind of 0:

import numpy as np

print(
    2**31 >> 32,
    np.uint32(2**31) >> 32,
    np.array(2**31, dtype=np.uint32) >> 32,
    np.right_shift(2**31, 32),
    np.right_shift([2**31], 32),
    np.right_shift(np.uint32(2**31), 32),
    np.right_shift(np.array(2**31, dtype=np.uint32), 32),
)

In particular, what is different between Numpy arrays representing 2147483648 and [2147483648]?

I have seen this issue in JavaScript ("Why does << 32 not result in 0 in javascript?") and C++ ("Weird behavior of right shift operator (1 >> 32)", "Why is `int >> 32` not always zero?"), but not yet in Python/NumPy. In fact, neither the Python nor the NumPy docs seem to document this behavior.

  • That's quite strange, and I can't give you a reason for it, but note that >> 31 returns 1 and << 16 returns 0, which are the expected results. Maybe >> 32 changes the type to a double or something like that.
    – Nenri
    Commented Mar 11, 2019 at 8:14
  • Actually, any number shifted by >> 32 remains the same: np.array([123], dtype=np.uint32) >> 32 equals np.array([123], dtype=np.uint32). Even more, np.array([123], dtype=np.uint32) >> 33 equals np.array([123], dtype=np.uint32) >> 1. Really not expected.
    Commented Mar 11, 2019 at 8:19

1 Answer


This is not documented, but NumPy is mostly implemented in C, and in C (and C++) the result of a shift is undefined when the shift count is greater than or equal to the number of bits in the (promoted) left operand. So the result can be arbitrary.
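As a rough illustration of why the value can come back unchanged: one common outcome (an assumption here, not something the C standard or NumPy guarantees) is that the CPU only looks at the low 5 bits of the shift count for 32-bit operands, as x86 shift instructions do. The hypothetical helper below models that in plain Python. It is a sketch of one possible result of the undefined shift, not NumPy's actual code.

```python
def masked_shift_right_32(value, count):
    """Model a 32-bit logical right shift whose count is masked to 5 bits.

    Hypothetical helper for illustration only: it assumes the hardware
    uses `count & 31`, which is one possible outcome of the undefined shift.
    """
    value &= 0xFFFFFFFF           # keep only the low 32 bits of the operand
    return value >> (count & 31)  # a count of 32 becomes 0, 33 becomes 1


print(masked_shift_right_32(2**31, 32))  # 2147483648, same as shifting by 0
print(masked_shift_right_32(123, 33))    # 61, same as 123 >> 1
print(masked_shift_right_32(2**31, 31))  # 1, counts below 32 are unaffected
```

Under this model a shift by 32 behaves like a shift by 0 and a shift by 33 like a shift by 1, which matches what was observed in the comments on the question.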

If you look at the types of the examples that work you'll see why they work:

import numpy as np

print(
    type(2**31 >> 32),
    type(np.uint32(2**31) >> 32),
    type(np.array(2**31, dtype=np.uint32) >> 32),
    type(np.right_shift(2**31, 32)),
    np.right_shift([2**31], 32).dtype,
    type(np.right_shift(np.uint32(2**31), 32)),
    type(np.right_shift(np.array(2**31, dtype=np.uint32), 32)),
)

<class 'int'> <class 'numpy.int64'> <class 'numpy.int64'> <class 'numpy.int64'> int64 <class 'numpy.int64'> <class 'numpy.int64'>

The first uses Python's own int type, while the others are all converted to numpy.int64, where a shift by 32 bits is well defined and gives the correct result. For the scalar and zero-dimensional cases, this is because scalars (and zero-dimensional arrays) follow different type promotion rules than arrays with one or more dimensions. In the list case, it is because NumPy's default integer type is not numpy.uint32 (it is int64 in the output above).
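If the goal is simply to get 0 for a 32-bit shift, the same widening can be requested explicitly instead of relying on scalar promotion. A minimal sketch, assuming a uint64 result is acceptable (the variable names are illustrative):

```python
import numpy as np

arr = np.array([2**31], dtype=np.uint32)

# Widen to 64 bits first, so a shift by 32 stays below the operand's bit width.
# The shift count is passed as np.uint64 so both operands share one dtype.
widened = arr.astype(np.uint64) >> np.uint64(32)

print(widened, widened.dtype)  # [0] uint64
```

This only moves the limit: shifting the widened array by 64 or more would run into the same problem again.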

On the other hand

print((np.array([2**31], dtype=np.uint32) >> 32).dtype)

uint32

So here the shift is carried out on 32-bit values, and a shift by 32 runs into the undefined behavior.
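One way to avoid depending on the undefined case while keeping the uint32 dtype is to handle oversized shift counts explicitly. The function below is a hypothetical helper, not part of NumPy. It is a sketch that assumes unsigned integer input and non-negative shift counts, and it returns 0 wherever the count is at least the bit width:

```python
import numpy as np


def safe_right_shift(values, shift):
    """Right-shift unsigned integers, giving 0 when shift >= bit width.

    Hypothetical helper: assumes an unsigned dtype and non-negative shifts.
    """
    values = np.asarray(values)
    if not np.issubdtype(values.dtype, np.unsignedinteger):
        raise TypeError("safe_right_shift expects an unsigned integer dtype")

    bits = values.dtype.itemsize * 8   # e.g. 32 for uint32
    shift = np.asarray(shift)
    in_range = shift < bits            # True where the C-level shift is defined

    # Replace out-of-range counts with 0 so the real shift is always defined.
    clamped = np.where(in_range, shift, 0).astype(values.dtype)
    shifted = np.right_shift(values, clamped)

    # Out-of-range counts shift every bit out, so the result there is 0.
    return np.where(in_range, shifted, np.zeros_like(shifted))


arr = np.array([2**31], dtype=np.uint32)
print(safe_right_shift(arr, 32))  # [0]
print(safe_right_shift(arr, 31))  # [1]
```

Because the shift count is converted to the dtype of the values before shifting, the result keeps that dtype instead of being promoted.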

  • Since type(np.uint32(2**31) >> 32) is numpy.int64, np.array(2**31, dtype=np.uint32) >> 64 will give a nonzero result again.
    Commented Mar 11, 2019 at 8:48
  • @LeonidMednikov yes, if you start shifting those 64-bit integers by 64 bits or more you will run into similar issues.
    – Joe
    Commented Mar 11, 2019 at 8:49
  • Thanks for the explanation, good idea to look at types. I get that the observed effect may be due to undefined behavior, but should it be undefined at all? Shouldn't the definition of the language be independent of its implementation? In other words, shouldn't at least the definition (= documentation) of the language be amended?
    – bers
    Commented Mar 11, 2019 at 12:10
  • "This is mostly due to the fact that scalar (zero-dimensional) arrays behave differently." This is true independent of shifts - I was not aware of that: (np.array(1, dtype=np.uint32) + 1).dtype is int64, while (np.array([1], dtype=np.uint32) + 1).dtype is uint32. Got it.
    – bers
    Commented Mar 11, 2019 at 12:17
  • @bers I think this should be considered a bug in NumPy so I have reported it.
    – javidcf
    Commented Mar 11, 2019 at 15:08
