First of all, what do I mean by 'correct definition'?
For example, K&R in "C Programming Language" 2nd ed., in section 2.2 Data Types and Sizes, make very clear statements about integers:
- There are `short`, `int` and `long` for integer types. They are needed to represent values with different ranges.
- `int` is a "naturally" sized number for a specific hardware, so probably also the fastest.
- Sizes for integer types `short`, `int` and `long` are purely implementation-dependent.
- But they have restrictions.
- `short` and `int` shall hold at least 16 bits.
- `long` shall hold at least 32 bits.
- `short` <= `int` <= `long`.
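These guarantees can be checked on a given implementation. The following is only a sketch: the function names are made up for illustration, and the checks use the range macros from `<limits.h>`, since the standard states the guarantees in terms of ranges rather than byte counts.

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if the minimum ranges and the ordering
   short <= int <= long hold on this implementation. */
int integer_guarantees_hold(void)
{
    return SHRT_MAX >= 32767            /* short: at least 16 bits */
        && INT_MAX >= 32767             /* int: at least 16 bits */
        && LONG_MAX >= 2147483647L      /* long: at least 32 bits */
        && SHRT_MAX <= INT_MAX          /* short is no wider than int */
        && INT_MAX <= LONG_MAX;         /* int is no wider than long */
}

/* Hypothetical helper: prints the implementation-defined storage sizes. */
void print_integer_sizes(void)
{
    printf("short: %zu bytes, int: %zu bytes, long: %zu bytes\n",
           sizeof(short), sizeof(int), sizeof(long));
}
```

On a conforming implementation `integer_guarantees_hold()` should return 1, while the sizes printed by `print_integer_sizes()` vary from platform to platform.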
That's very clear and unambiguous. And that is not the case for the `size_t` type. In K&R 5.4 Address arithmetic, they say:
- ... `size_t` is the unsigned integer type returned by the `sizeof` operator.
- The `sizeof` operator yields the number of bytes required to store an object of the type of its operand.
In C99 standard draft, in 6.5.3.4 The sizeof operator, they say:
- The value of the result is implementation-defined, and its type (an unsigned integer type) is `size_t`, defined in `<stddef.h>` (and other headers).
In 7.17 Common definitions:
- `size_t` which is the unsigned integer type of the result of the `sizeof` operator;
In 7.18.3 Limits of other integer types:
- limit of `size_t`: `SIZE_MAX` 65535
There is also a useful article - Why size_t matters. It says the following:
- Okay, let's try to imagine what it would be like if there were no `size_t`.
- For example, let's take the `void *memcpy(void *s1, void const *s2, size_t n);` standard function from `<string.h>`.
- Let's use `int` instead of `size_t` for the `n` parameter.
- But the size of memory can't be negative, so let's better take `unsigned int`.
- Good, seems like we are happy now, and without `size_t`.
- But `unsigned int` has limited size - what if there is some machine that can copy chunks of memory larger than `unsigned int` can hold?
- Okay, let's use `unsigned long` then, now we are happy?
- But for those machines which operate with smaller memory chunks, `unsigned long` would be inefficient, because `long` is not "natural" for them; they must perform additional operations to work with `long`s.
- So that's why we need `size_t` - to represent a size of memory that particular hardware can operate on at once. On some machines it would be equal to `int`, on others to `long`, depending on which type they are most efficient with.
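To make the article's argument concrete, here is a minimal sketch of a `memcpy`-like function whose length parameter is `size_t`. The name `copy_bytes` is hypothetical and the loop is deliberately naive; a real `memcpy` is far more optimized.

```c
#include <stddef.h>

/* Hypothetical stand-in for memcpy: copies n bytes from src to dst.
   The two regions must not overlap. Because n is a size_t, the caller
   can pass the size of any object the implementation supports. */
void *copy_bytes(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    for (size_t i = 0; i < n; i++)
        d[i] = s[i];

    return dst;
}
```

A call such as `copy_bytes(dst, src, sizeof src)` needs no cast, because `sizeof` already yields a `size_t`.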
What I understand from it is that `size_t` is strictly bound to the `sizeof` operator. And therefore `size_t` represents a maximum size of an object in bytes. It might also represent a number of bytes that a particular CPU model can move at once.
But there is still a lot of mystery for me here:
- What is "object" in terms of C?
- Why it's limited to 65535, which is maximum number, that could be represented by 16 bits? The article on embedded.com says, that `size_t` could be 32 bit too.
- K&R says, that `int` has "natural" size for the platform, and it can be equal to `int` or to `long`. So why not use it instead of `size_t` if it's "natural"?

There is a similar question:
What is size_t in C?
But its answers don't provide a clear definition or links to authoritative sources (unless Wikipedia counts as one).
I want to know when to use `size_t`, when not to use `size_t`, why it was introduced, and what it really represents.
when to use `size_t`
Use `size_t` to represent non-negative indexes, and to work with values that could be traced back to a `sizeof` expression.
when to not use `size_t`
Whenever a value could possibly be negative, e.g. when you subtract pointers. This is allowed for pointers into the same array, but it may yield a negative number, depending on the relative positions of the pointers. There is another type, `ptrdiff_t`, defined for this situation.
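A small sketch of the pointer-subtraction case, assuming both pointers refer to elements of the same array (the function name `element_distance` is made up for illustration):

```c
#include <stddef.h>

/* Hypothetical helper: signed distance, in elements, from 'from' to 'to'.
   Both pointers must point into (or one past the end of) the same array;
   otherwise the subtraction is undefined. */
ptrdiff_t element_distance(const int *from, const int *to)
{
    return to - from;   /* the result type of pointer subtraction is ptrdiff_t */
}
```

With `int a[5]`, `element_distance(&a[1], &a[4])` gives 3 and `element_distance(&a[4], &a[1])` gives -3, a value that an unsigned type such as `size_t` cannot represent.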
why it was introduced
Designers of the standard had a choice of introducing a separate type, or requiring an existing type to be capable of holding sizes. The first choice gives compiler writers more flexibility, so the designers went with a separate type.
what it really represents
It is capable of representing the size of any object in memory, be it an array, a `struct`, an array of `struct`s, an array of arrays of `struct`s, or anything else. The size is expressed in bytes.
The type is also convenient for non-negative indexes, because it can represent an index into a structure of any size at the finest granularity (i.e. an index into the largest possible array of `char`s, because the standard requires `char` to have the smallest possible size of `1`).
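As a rough illustration of indexing with `size_t`, the sketch below uses two hypothetical helpers. The second one shows one common way to walk an array backwards, since a condition like `i >= 0` is always true for an unsigned type.

```c
#include <stddef.h>

/* Hypothetical helper: sums 'count' elements using a size_t index. */
long sum_forward(const int *values, size_t count)
{
    long total = 0;

    for (size_t i = 0; i < count; i++)
        total += values[i];

    return total;
}

/* Hypothetical helper: copies src into dst in reverse order.
   The loop tests i before decrementing it, so it stops cleanly at
   index 0 and also handles count == 0. */
void reverse_copy(int *dst, const int *src, size_t count)
{
    for (size_t i = count; i-- > 0; )
        dst[count - 1 - i] = src[i];
}
```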
What is "object" in terms of C?
"Object" is a defined term. The C99 standard defines it as: "region of data storage in the execution environment, the contents of which can represent values" (section 3.14). A more colloquial definition might be "the storage in memory for a value." Objects come in different sizes, depending on the type of the value stored. That type includes not only simple types such as `char` and `int`, but also complex types such as structures and arrays. For example, the storage for an array is an object, within which is an object for each element.
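The array-within-which-are-element-objects idea can be seen with `sizeof`. This is just a sketch; `struct sample`, `ITEM_COUNT` and the function names are invented for the example.

```c
#include <stddef.h>

#define ITEM_COUNT 4

/* Hypothetical structure type; its size may include padding. */
struct sample {
    char tag;
    int value;
};

/* Size in bytes of the whole array object. */
size_t whole_array_bytes(void)
{
    struct sample items[ITEM_COUNT];
    return sizeof items;
}

/* Number of element objects inside the array object. */
size_t element_count(void)
{
    struct sample items[ITEM_COUNT];
    return sizeof items / sizeof items[0];
}
```

The array object occupies exactly `ITEM_COUNT * sizeof(struct sample)` bytes, and every one of these sizes has type `size_t`.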
Why it's limited to 65535, which is maximum number, that could be represented by 16 bits? The article on embedded.com says, that size_t could be 32 bit too.
You misunderstand. Re-read the first two paragraphs of section 7.18.3. `SIZE_MAX` represents the maximum value of type `size_t`, but its actual value is implementation-dependent. The value given in the standard is the minimum that it can be. In most implementations it is larger.
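One way to see this on a particular implementation is to inspect `SIZE_MAX` from `<stdint.h>`. The helper names below are hypothetical, and the printed value depends on the platform.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if SIZE_MAX meets the standard's minimum. */
int size_max_meets_minimum(void)
{
    return SIZE_MAX >= 65535u;
}

/* Hypothetical helper: storage width of size_t in bits. */
unsigned size_t_bits(void)
{
    return (unsigned)(sizeof(size_t) * CHAR_BIT);
}

/* Hypothetical helper: prints the actual, implementation-defined limit. */
void print_size_limit(void)
{
    printf("SIZE_MAX = %zu (%u bits)\n", (size_t)SIZE_MAX, size_t_bits());
}
```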
K&R says, that int has "natural" size for the platform, and it can be equal to int or to long. So why not use it instead of size_t if it's "natural"?
Because there is no particular reason that the maximum size of an object should be limited to the number of bytes expressible in a single machine word (which is pretty much what the "natural size" means). Nor, where `int` and `long` differ in size, is it clear which one should correspond to `size_t`, if either. Using `size_t` instead of one of these abstracts away machine details and makes your code more portable.
In response to the update:
I want to know, when to use size_t, when to not use size_t, why it was introduced and what it really represents.
`size_t` is primarily defined as the type of the result of `sizeof`. It follows that what it "really represents" is the size of an object.
Use `size_t` to hold values that represent or are related to the size of an object. That is expressly what it's for. For the most part, you can accomplish this by type matching: use variables of type `size_t` to store the values declared to have that type, such as the return values of certain functions (e.g. `strlen()`) and the results of certain operators (e.g. `sizeof`).
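Type matching might look like the following sketch, where the hypothetical function `count_char` keeps every size-related value in `size_t` because that is the type `strlen()` returns.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts how many times ch occurs in text. */
size_t count_char(const char *text, char ch)
{
    size_t length = strlen(text);   /* strlen() returns size_t */
    size_t hits = 0;

    for (size_t i = 0; i < length; i++) {
        if (text[i] == ch)
            hits++;
    }

    return hits;
}
```

Storing the result of `strlen()` in an `int` instead would force a conversion that can lose information for very long strings.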
Do not use `size_t` for values that represent something other than an object size or something closely related (such as the sum or positive difference of object sizes).
Why it's limited to 65535, which is maximum number, that could be represented by 16 bits?
It's at least 16 bits.
According to the 1999 ISO C standard (C99), `size_t` is an unsigned integer type of at least 16 bits (see sections 7.17 and 7.18.3).
Why use `size_t`?
`size_t` is a type guaranteed to hold any array index.
`size_t` can be any of `unsigned char`, `unsigned short`, `unsigned int`, `unsigned long` or `unsigned long long` (or something else entirely), depending on the implementation.
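If you are curious which of these your implementation picked, a C11 `_Generic` selection can report it. This is only a sketch: it needs a C11 compiler, and the macro and function names are invented for the example.

```c
#include <stddef.h>

/* Hypothetical macro: maps the type of x to a descriptive string.
   The default branch covers implementations where size_t is some
   other (possibly extended) unsigned integer type. */
#define TYPE_NAME(x) _Generic((x),                  \
    unsigned char:      "unsigned char",            \
    unsigned short:     "unsigned short",           \
    unsigned int:       "unsigned int",             \
    unsigned long:      "unsigned long",            \
    unsigned long long: "unsigned long long",       \
    default:            "some other type")

/* Hypothetical helper: names the standard type that size_t aliases here. */
const char *size_t_underlying_name(void)
{
    return TYPE_NAME((size_t)0);
}
```

The answer differs between platforms, which is exactly why portable code should say `size_t` rather than hard-coding one of the candidates.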
As for using `unsigned int` or `unsigned long` in place of `size_t`, the reason not to is similar: they are not the only unsigned integer types.
Its purpose is to relieve the programmer from having to worry about which of the predefined types is used to represent sizes.
On one system, it might make sense to use `unsigned int` to represent sizes; on another, it might make more sense to use `unsigned long` or `unsigned long long`.
So using `size_t` has the advantage that the code is likely to be more portable.