Adding or assigning an integer literal to a size_t

Tags:

c

size-t

c89

In C I see a lot of code that adds or assigns an integer literal to a size_t variable.

size_t foo = 1;
foo += 1;

What conversion takes place here, and can it ever happen that a size_t is "upgraded" to an int and then converted back to a size_t? Would that still wrap around if I was at the max?

size_t foo = SIZE_MAX;
foo += 1;

Is that defined behavior? It's an unsigned type size_t which is having a signed int added to it (that may be a larger type?) and then converted back to a size_t. Is there risk of signed integer overflow?

Would it make sense to write something like foo + bar + (size_t)1 instead of foo + bar + 1? I never see code like that, but I'm wondering if it's necessary if integer promotions are troublesome.

C89 doesn't say how a size_t will be ranked or what exactly it is:

The value of the result is implementation-defined, and its type (an unsigned integral type) is size_t defined in the <stddef.h> header.

asked Oct 22 '16 02:10

newguy

2 Answers

The current C standard permits an implementation on which the following code has undefined behavior. No such implementation exists, however, and one probably never will:

size_t foo = SIZE_MAX;
foo += 1;

The type size_t is an unsigned type¹, with a minimum range:² [0,65535].

The type size_t may be defined as a synonym for the type unsigned short. The type unsigned short may be defined having 16 precision bits, with the range: [0,65535]. In that case the value of SIZE_MAX is 65535.

The type int may be defined having 16 precision bits (plus one sign bit), two's complement representation, and range: [-65536,65535].

The expression foo += 1 is equivalent to foo = foo + 1 (except that foo is evaluated only once, but that is irrelevant here). The variable foo will get promoted using integer promotions³. It will get promoted to type int because type int can represent all values of type size_t, and the rank of size_t, being a synonym for unsigned short, is lower than the rank of int. Since the maximum values of size_t and int are the same, the addition causes a signed overflow, which is undefined behavior.

This holds for the current standard, and it should also hold for C89 since it doesn't have any stricter restrictions on types.
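The promotion step can be observed on an ordinary compiler with a stand-in type. The following is only a sketch under stated assumptions: small_size is a hypothetical typedef for uint16_t standing in for a 16-bit size_t, the function names are made up for illustration, and C11 is assumed for _Generic and static_assert. Because mainstream implementations have an int wider than 16 bits, the sketch shows the promotion to int but not the overflow itself.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-in for a 16-bit size_t; no real size_t is implied. */
typedef uint16_t small_size;

/* This sketch assumes int is wider than 16 bits, so int can hold 65536. */
static_assert(INT_MAX > UINT16_MAX, "sketch assumes int is wider than 16 bits");

/* Returns 1 if small_size + 1 is evaluated in int, 0 for any other type. */
int small_size_promotes_to_int(void)
{
    small_size s = 0;
    return _Generic(s + 1, int: 1, default: 0);
}

/* Adds 1 to the largest small_size value. Under the assumption above the
   operand is promoted to int and the sum 65536 fits, so nothing overflows. */
long small_size_max_plus_one(void)
{
    small_size s = UINT16_MAX;
    return s + 1;
}
```

Under those assumptions small_size_promotes_to_int() returns 1 and small_size_max_plus_one() returns 65536. The undefined case described above would need an int whose maximum equals the maximum of the unsigned type, which this sketch deliberately rules out.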

A solution that avoids signed overflow on any imaginable implementation is to use an unsigned int integer constant:

foo += 1u;

In that case, if foo has a lower rank than int, it is first promoted to int and then converted to unsigned int by the usual arithmetic conversions, so the addition is performed in an unsigned type.
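A minimal sketch of that suggestion, wrapped in a function so the wraparound can be checked. The name increment_wrapping is made up for illustration, SIZE_MAX assumes C99's <stdint.h>, and static_assert assumes C11.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>
#include <stdint.h>   /* SIZE_MAX (C99) */

/* size_t is unsigned, so converting -1 to it yields its maximum value. */
static_assert((size_t)-1 == SIZE_MAX, "size_t must be an unsigned type");

/* Adds one using an unsigned constant, so the addition is carried out in an
   unsigned type and wraps to 0 at SIZE_MAX instead of overflowing. */
size_t increment_wrapping(size_t n)
{
    n += 1u;
    return n;
}
```

Calling increment_wrapping(SIZE_MAX) returns 0, and increment_wrapping(41) returns 42.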

¹ (Quoted from ISO/IEC 9899:201x 7.19 Common definitions 2)
size_t
which is the unsigned integer type of the result of the sizeof operator;

² (Quoted from ISO/IEC 9899:201x 7.20.3 Limits of other integer types 2)
limit of size_t
SIZE_MAX 65535

³ (Quoted from ISO/IEC 9899:201x 6.3.1.1 Boolean, characters, and integers 2)
The following may be used in an expression wherever an int or unsigned int may be used:
An object or expression with an integer type (other than int or unsigned int) whose integer conversion rank is less than or equal to the rank of int and unsigned int.
If an int can represent all values of the original type (as restricted by the width, for a bit-field), the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions. All other types are unchanged by the integer promotions.

answered Sep 26 '22 02:09

2501

It depends, since size_t is an implementation-defined unsigned integral type.

Operations involving a size_t will therefore introduce promotions, but these depend on what size_t actually is, and what other types involved in the expression actually are.

If size_t was equivalent to an unsigned short (e.g. a 16-bit type) then

size_t foo = 1;
foo += 1;

would (semantically) promote foo to an int, add 1, and then convert the result back to size_t for storing in foo. (I say "semantically", because that is the meaning of the code according to the standard. A compiler is free to apply the "as if" rule, i.e. do anything it likes, as long as it delivers the same net effect).

On the other hand, if size_t was equivalent to a long long unsigned (e.g. a 64-bit unsigned type), then the same code would convert 1 to type long long unsigned, add that to the value of foo, and store the result back into foo.

In both cases, the net result is the same unless an overflow occurs. In this case, there is no overflow, since both int and size_t are guaranteed to be able to represent the values 1 and 2.

If an overflow occurs (e.g. adding a larger integral value), then the behaviour can vary. Overflow of a signed integral type (e.g. int) results in undefined behaviour. Overflow of an unsigned integral type uses modulo arithmetic.
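The difference between the two kinds of overflow can be sketched as follows. The function names are made up for illustration; the point is that unsigned addition needs no guard, while signed addition has to be checked before it is performed.

```c
#include <assert.h>
#include <limits.h>

/* Unsigned addition is always defined: the result is reduced modulo UINT_MAX + 1. */
unsigned int add_unsigned(unsigned int a, unsigned int b)
{
    return a + b;
}

/* Returns 1 if a + b would overflow int, 0 otherwise, without performing the addition. */
int add_would_overflow(int a, int b)
{
    return (b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b);
}

/* Signed addition with the no-overflow precondition made explicit. */
int add_signed(int a, int b)
{
    assert(!add_would_overflow(a, b));
    return a + b;
}
```

Here add_unsigned(UINT_MAX, 1u) returns 0, whereas add_would_overflow(INT_MAX, 1) returns 1 to signal that the signed sum must not be computed.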

As to the code

size_t foo = SIZE_MAX;
foo += 1;

it is possible to do the same sort of analysis.

If size_t is equivalent to an unsigned short then foo is subject to the integer promotions. If int is equivalent to a signed short, it cannot represent the value of SIZE_MAX, so foo is promoted to unsigned int rather than int; the addition then uses modulo arithmetic and the value stored in foo is 0, with no undefined behaviour. If int is able to represent a larger range than short int (e.g. it is equivalent to long), then the conversion of foo to int will succeed, incrementing that value will succeed, and storing back to size_t will use modulo arithmetic and produce the result of 0.

If size_t is equivalent to unsigned long, then the value 1 will be converted to unsigned long, adding that to foo will use modulo arithmetic (i.e. produce a result of zero), and that will be stored into foo.

It is possible to do similar analyses assuming that size_t is actually other unsigned integral types.
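Which of these cases applies on a given implementation can be checked in code. This is a sketch assuming C11 (for _Generic) and C99's <stdint.h>. The function names are made up for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>   /* SIZE_MAX, uintmax_t (C99) */

/* Returns 1 if this implementation evaluates size_t + 1 in signed int
   (size_t ranks below int and int can hold every size_t value), else 0. */
int size_t_sum_is_int(void)
{
    size_t s = 0;
    return _Generic(s + 1, int: 1, default: 0);
}

/* Evaluates SIZE_MAX + 1 and stores it back in a size_t. */
size_t size_max_plus_one(void)
{
    size_t foo = SIZE_MAX;
    /* The only undefined case is an int sum with INT_MAX equal to SIZE_MAX. */
    assert(!size_t_sum_is_int() || (uintmax_t)SIZE_MAX < (uintmax_t)INT_MAX);
    foo += 1;
    return foo;
}
```

On mainstream 32-bit and 64-bit platforms I would expect size_t_sum_is_int() to return 0, since size_t is usually at least as wide as unsigned int. Whenever the assertion holds, size_max_plus_one() returns 0.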

Note: In modern systems, a size_t that is the same size or smaller than an int is unusual. However, such systems have existed (e.g. Microsoft and Borland C compilers targeting 16-bit MS-DOS on hardware with an 80286 CPU). There are also 16-bit microprocessors still in production, mostly for use in embedded systems with low power usage and low throughput requirements, and C compilers that target them (e.g. the Keil C166 compiler, which targets the Infineon XE166 microprocessor family). [Note: I've never had reason to use the Keil compiler but, given its target platform, it would not be a surprise if it supports a 16-bit size_t that is the same size or smaller than the native int type on that platform].

answered Sep 27 '22 02:09

Peter
