
Why would you need unsigned types in Java?

I have often heard complaints about Java's lack of unsigned data types; see, for example, this comment. How is this actually a problem? I have been programming in Java for about 10 years and have never had issues with it. Occasionally when converting bytes to ints a & 0xFF is needed, but I don't consider that as a problem.

Since unsigned and signed numbers are represented with the same bit values, the only places I can think of where signedness matters are:

  • When converting the numbers to other bit representations. Between 8, 16 and 32 bit integer types you can use bitmasks if needed.
  • When converting numbers to decimal format, usually to Strings.
  • Interoperating with non-Java systems through APIs or protocols. Again the data is just bits, so I don't see the problem here.
  • Using the numbers as memory or other offsets. With 32 bit ints this might be a problem for very large offsets.

Instead I find it easier that I don't need to consider operations between unsigned and signed numbers and the conversions between those. What am I missing? What are the actual benefits of having unsigned types in a programming language and how would having those make Java better?

asked Mar 19 '26 21:03 by msell


1 Answer

"Occasionally when converting bytes to ints a & 0xFF is needed, but I don't consider that as a problem."

Why not? Is "applying a bitwise AND with 0xFF" actually part of what your code is trying to represent? If not, why should it have to be part of how you write it? I actually find that almost anything I want to do with bytes beyond just copying them from one place to another ends up requiring a mask. I want my code to be cruft-free; the lack of unsigned bytes hampers this :(
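To make the point about masks concrete, here is a minimal sketch of summing bytes as values in the range 0..255. The class and method names are hypothetical and chosen only for illustration. The intent of the loop is "add up the bytes", yet the mask has to appear in the code, and leaving it out still compiles.

```java
// Hypothetical helper class; the names are illustrative and not from the original post.
final class ByteMaskSketch {
    // Sums the bytes as unsigned values (0..255).
    static int unsignedSum(byte[] data) {
        int sum = 0;
        for (byte b : data) {
            // The mask is needed only because byte is signed in Java:
            // without it, bytes >= 0x80 are sign-extended to negative ints.
            sum += b & 0xFF;
        }
        return sum;
    }

    // The same loop without the mask: it compiles, but treats bytes >= 0x80 as negative.
    static int signedSum(byte[] data) {
        int sum = 0;
        for (byte b : data) {
            sum += b;
        }
        return sum;
    }

    public static void main(String[] args) {
        byte[] data = {(byte) 0xFF, (byte) 0x80, 0x01};
        System.out.println(unsignedSum(data)); // 255 + 128 + 1 = 384
        System.out.println(signedSum(data));   // -1 + -128 + 1 = -128
    }
}
```

On Java 8 and later, Byte.toUnsignedInt(b) does the same thing as b & 0xFF and at least names the intent, but it is still a conversion the caller has to remember.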

Additionally, consider an API which will always return a non-negative value, or only accepts non-negative values. Using an unsigned type allows you to express that clearly, without any need for validation. Personally I think it's a shame that unsigned types aren't used more in .NET, e.g. for things like String.Length, ICollection.Count etc. It's very common for a value to naturally only be non-negative.
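As a rough illustration of the validation point, the sketch below shows what a "non-negative only" parameter tends to look like in Java. The allocate method is a hypothetical API invented for this example. Because the parameter type is int, the contract lives in a runtime check and in documentation rather than in the signature.

```java
// Hypothetical API used only to illustrate the point; not from the original post.
final class NonNegativeSketch {
    // The type int admits negative values, so the method has to reject them itself.
    static int[] allocate(int count) {
        if (count < 0) {
            throw new IllegalArgumentException("count must be non-negative: " + count);
        }
        return new int[count];
    }

    public static void main(String[] args) {
        System.out.println(allocate(3).length); // 3
        try {
            allocate(-1);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```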

Is the lack of unsigned types in Java a fatal flaw? Clearly not. Is it an annoyance? Absolutely.

The comment that you quote hits the nail on the head:

"Java's lack of unsigned data types also stands against it. Yes, you can work around it, but it's not ideal and you'll be using code that doesn't really reflect the underlying data correctly."

Suppose you are interoperating with another system which wants an unsigned 16-bit integer, and you want to represent the number 65535. You claim "the data is just bits, so I don't see the problem here" - but having to pass -1 to mean 65535 is a problem. Any impedance mismatch between the representation of your data and its underlying meaning introduces an extra speed bump when writing, reading and testing the code.
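The 65535 case can be sketched as follows. The class and method names are hypothetical, and the sketch assumes Java 8 or later for Short.toUnsignedInt. The bits are the same either way, but every reader of the code has to know that the short being passed around is "really" unsigned.

```java
// Hypothetical helper class; the names are illustrative and not from the original post.
final class Unsigned16Sketch {
    // Stores a value in 0..65535 in a Java short, which is the only 16-bit integer type available.
    static short toWire(int value) {
        if (value < 0 || value > 0xFFFF) {
            throw new IllegalArgumentException("out of range for unsigned 16-bit: " + value);
        }
        return (short) value; // values above 32767 become negative shorts
    }

    // Recovers the intended 0..65535 value from the signed short.
    static int fromWire(short raw) {
        return Short.toUnsignedInt(raw); // equivalent to raw & 0xFFFF
    }

    public static void main(String[] args) {
        short raw = toWire(65535);
        System.out.println(raw);           // -1
        System.out.println(fromWire(raw)); // 65535
    }
}
```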

"Instead I find it easier that I don't need to consider operations between unsigned and signed numbers and the conversions between those."

The only time you would need to consider those operations is when you are naturally working with values of two different types - one signed and one unsigned. At that point, you absolutely want to have that difference pointed out. With signed types being used to represent naturally unsigned values, you should still be considering the differences, but the fact that you should is hidden from you. Consider:

// This should be considered unsigned - so a value of -1 is "really" 65535
short length = /* some value */;
// This is really signed
short foo = /* some value */;

boolean result = foo < length;

Suppose foo is 100 and length is -1. What's the logical result? The value of length represents 65535, so logically foo is smaller than it and the result should be true. But the code above compares 100 with -1 and gives false, and nothing in the code or the compiler points out the mistake.

Of course they don't even need to represent different types here. They could both be naturally unsigned values, represented as signed values with negative numbers being logically greater than positive ones. The same error applies, and wouldn't be a problem if you had unsigned types in the language.
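For comparison, here is one way the comparison could be written correctly in current Java. This is a sketch with hypothetical names, assuming Java 8 or later for Short.toUnsignedInt. Each value is widened to int according to what it means, which is exactly the bookkeeping the type system does not do for you.

```java
// Hypothetical helper class; the names are illustrative and not from the original post.
final class UnsignedCompareSketch {
    // The naive comparison from the example: both shorts are treated as signed.
    static boolean naiveLessThan(short foo, short length) {
        return foo < length;
    }

    // foo is genuinely signed, length is logically unsigned (0..65535).
    static boolean signedLessThanUnsigned(short foo, short length) {
        return foo < Short.toUnsignedInt(length);
    }

    // Both values are logically unsigned.
    static boolean unsignedLessThan(short a, short b) {
        return Short.toUnsignedInt(a) < Short.toUnsignedInt(b);
    }

    public static void main(String[] args) {
        short foo = 100;
        short length = -1; // "really" 65535
        System.out.println(naiveLessThan(foo, length));          // false
        System.out.println(signedLessThanUnsigned(foo, length)); // true
        System.out.println(unsignedLessThan(foo, length));       // true
    }
}
```

For int values, Integer.compareUnsigned plays the same role. In every case the caller has to remember which variables are "really" unsigned, because the declared types cannot say so.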

You might also want to read this interview with Joshua Bloch (Google cache, as I believe it's gone from java.sun.com now), including:

"Ooh, good question... I'm going to say that the strangest thing about the Java platform is that the byte type is signed. I've never heard an explanation for this. It's quite counterintuitive and causes all sorts of errors."

answered Mar 21 '26 09:03 by Jon Skeet