I'm surprised by C# compiler behavior in the following example:
int i = 1024; uint x = 2048; x = x+i; // error CS0266: Cannot implicitly convert type 'long' to 'uint' ...
That seems OK, since int + uint can overflow. However, if uint is changed to int, the error disappears, as if int + int could not overflow:
int i = 1024; int x = 2048; x = x+i; // OK, int
Moreover, uint + uint = uint:
uint i = 1024; uint x = 2048; x = x+i; // OK, uint
It seems totally obscure.
Why int + int = int and uint + uint = uint, but int + uint = long?
What is the motivation for this decision?
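For context, here is a minimal sketch of the failing line together with one way to make it compile, assuming the intent is simply to store the sum back into the uint (the explicit cast silently wraps on overflow unless a checked context is used):
int i = 1024;
uint x = 2048;
// x = x + i;         // error CS0266: the sum is typed as long, which will not implicitly narrow to uint
x = (uint)(x + i);    // compiles: the long result is explicitly narrowed back to uint
Console.WriteLine(x); // 3072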
In C and C++, when an unsigned int and an int are added together, the int is first converted to unsigned int before the addition takes place (and the result is also an unsigned int). -1, while being the first negative number, is actually equivalent to the largest unsigned number; that is, (unsigned int) -1 == UINT_MAX.
In a mathematical operation in C++ (e.g. arithmetic or comparison), if one signed and one unsigned integer are used, the signed integer will be converted to unsigned. And because unsigned integers cannot store negative numbers, this can result in loss of data.
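Those two paragraphs describe the C and C++ promotion rules rather than C#'s, but the -1/UINT_MAX identity itself carries over to C#'s uint, as this small check shows (unchecked is required because the compiler otherwise rejects the out-of-range constant cast):
uint fromMinusOne = unchecked((uint)-1);          // reinterpret the bit pattern of -1
Console.WriteLine(fromMinusOne);                  // 4294967295
Console.WriteLine(fromMinusOne == uint.MaxValue); // True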
uint means “unsigned integer” while int means “signed integer”. Unsigned integers only contain positive numbers (or zero).
Since we need values that can be both positive and negative more often than values that are only non-negative, int is the usual choice; when a value can never be negative, uint can be used instead. In C#, uint is a 32-bit unsigned integer, the same size as int.
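A quick sketch of that difference in C#:
int canBeNegative = -1;    // fine: int is signed
// uint negative = -1;     // error CS0031: Constant value '-1' cannot be converted to a 'uint'
uint zeroOrPositive = 0;   // fine: zero is the smallest value a uint can hold
Console.WriteLine($"{canBeNegative}, {zeroOrPositive}, max uint = {uint.MaxValue}");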
Why int + int = int and uint + uint = uint, but int + uint = long? What is the motivation for this decision?
The way the question is phrased implies the presupposition that the design team wanted int + uint to be long, and chose type rules to attain that goal. That presupposition is false.
Rather, the design team weighed questions such as which arithmetic operations should be defined on which numeric types, and which conversions between those types can be allowed without losing information, as well as many other considerations such as whether the design works for or against debuggable, maintainable, versionable programs, and so on. (I note that I was not in the room for this particular design meeting, as it predated my time on the design team. But I have read their notes and know the kinds of things that would have concerned the design team during this period.)
Investigating these questions led to the present design: that arithmetic operations are defined as int + int --> int, uint + uint --> uint, long + long --> long, int may be converted to long, uint may be converted to long, and so on.
A consequence of these decisions is that when adding uint + int, overload resolution chooses long + long as the closest match, and long + long is long, therefore uint + int is long.
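A small demonstration of that consequence, assuming ordinary (non-constant) operands so the compiler cannot fold the expression:
int i = 1024;
uint u = 2048;
var sum = u + i;                            // overload resolution picks long + long
Console.WriteLine(sum.GetType().FullName);  // System.Int64
long ok = u + i;                            // compiles: the result already is a long
// uint back = u + i;                       // error CS0266, as in the question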
Making uint + int exhibit some different behavior that you might consider more sensible was not a design goal of the team at all, because mixing signed and unsigned values is, first, rare in practice and, second, almost always a bug. The design team could have added special cases for every combination of signed and unsigned one-, two-, four-, and eight-byte integers, as well as char, float, double and decimal, or any subset of those many hundreds of cases, but that works against the goal of simplicity.
So in short, on the one hand we have a large amount of design work to make a feature that we want no one to actually use easier to use at the cost of a massively complicated specification. On the other hand we have a simple specification that produces an unusual behavior in a rare case we expect no one to encounter in practice. Given those choices, which would you choose? The C# design team chose the latter.
The short answer is "because the Standard says that it shall be so"; for which see the informative §14.2.5.2 of ISO 23270 (the C# Language Specification). The normative §13.1.2 (Implicit numeric conversions) says:
The implicit numeric conversions are:
...
- From int to long, float, double, or decimal.
- From uint to long, ulong, float, double, or decimal.
...
Conversions from int, uint, long or ulong to float and from long or ulong to double can cause a loss of precision, but will never cause a loss of magnitude. The other implicit numeric conversions never lose any information. (emph. mine)
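As a sketch of the conversions quoted above, each of the following assignments compiles without a cast, because every one of them is an implicit numeric conversion:
int i = -1024;
uint u = 2048;
long li = i;      // int  -> long
float fi = i;     // int  -> float
double di = i;    // int  -> double
decimal mi = i;   // int  -> decimal
long lu = u;      // uint -> long
ulong ulu = u;    // uint -> ulong
float fu = u;     // uint -> float
double du = u;    // uint -> double
decimal mu = u;   // uint -> decimal
Console.WriteLine($"{li} {fi} {di} {mi} {lu} {ulu} {fu} {du} {mu}");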
The [slightly] longer answer is that you are adding two different types: a 32-bit signed integer, whose domain is -2,147,483,648 (0x80000000) to +2,147,483,647 (0x7FFFFFFF), and a 32-bit unsigned integer, whose domain is 0 (0x00000000) to +4,294,967,295 (0xFFFFFFFF).
So the types aren't compatible, since an int can't contain any arbitrary uint and a uint can't contain any arbitrary int. They are implicitly converted (a widening conversion, per the requirement of §13.1.2 that no information be lost) to the next largest type that can contain both: a long in this case, a signed 64-bit integer, which has the domain -9,223,372,036,854,775,808 (0x8000000000000000) to +9,223,372,036,854,775,807 (0x7FFFFFFFFFFFFFFF).
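To make the incompatibility concrete, here is a small sketch using the extreme values: neither 32-bit type can hold the other's full range, but a long holds both ends comfortably:
int smallestInt = int.MinValue;    // -2,147,483,648: too small for any uint
uint largestUint = uint.MaxValue;  //  4,294,967,295: too large for any int
long a = smallestInt;              // implicit widening, no information lost
long b = largestUint;              // implicit widening, no information lost
Console.WriteLine($"{a} and {b} both fit in a long");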
Edited to note: just as an aside, executing this code:
var x = 1024 + 2048u; Console.WriteLine("'x' is an instance of `{0}`", x.GetType().FullName);
does not yield a long as in the original poster's example. Instead, what is produced is:
'x' is an instance of `System.UInt32`
This is because of constant folding. The first element in the expression, 1024, has no suffix and as such is an int, and the second element in the expression, 2048u, is a uint, according to the rules:
- If the literal has no suffix, it has the first of these types in which its value can be represented: int, uint, long, ulong.
- If the literal is suffixed by U or u, it has the first of these types in which its value can be represented: uint, ulong.
And since the optimizer knows what the values are, the sum is precomputed and evaluated as a uint.
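To illustrate the literal-typing rules quoted above, here is a minimal sketch of which type each constant receives (the values are chosen only to force each case):
var a = 2048;        // no suffix, fits in int        -> System.Int32
var b = 2048u;       // u suffix                      -> System.UInt32
var c = 3000000000;  // no suffix, too large for int  -> System.UInt32
var d = 5000000000;  // no suffix, too large for uint -> System.Int64
Console.WriteLine($"{a.GetType().Name} {b.GetType().Name} {c.GetType().Name} {d.GetType().Name}");
// Int32 UInt32 UInt32 Int64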
Consistency is the hobgoblin of little minds.