
Assign result of sizeof() to ssize_t

Tags: c, posix, sizeof

I needed to compare the result of sizeof(x) with an ssize_t.

Of course, GCC gave an error (luckily, I was compiling with -Wall -Wextra -Werror), so I decided to write a macro to get a signed version of sizeof().

#define ssizeof (ssize_t)sizeof

And then I can use it like this:

for (ssize_t i = 0; i < ssizeof(x); i++)

The problem is, do I have any guarantees that SSIZE_MAX >= SIZE_MAX? I imagine that sadly this is never going to be true.

Or at least that sizeof(ssize_t) == sizeof(size_t), which would cut half of the values but would still be close enough.

I didn't find any relation between ssize_t and size_t in the POSIX documentation.

Related question:

What type should be used to loop through an array?

alx asked Dec 06 '22 10:12


2 Answers

There is no guarantee that SSIZE_MAX >= SIZE_MAX. In fact, it is very unlikely to be the case, since size_t and ssize_t are likely to be corresponding unsigned and signed types, so (on all actual architectures) SIZE_MAX > SSIZE_MAX. Converting an unsigned value to a signed type which cannot hold that value gives an implementation-defined result (or raises an implementation-defined signal); it is not something you can rely on portably. So technically, your macro is problematic.

In practice, at least on 64-bit platforms, you're unlikely to get into trouble if the value you are converting to ssize_t is the size of an object which actually exists. But if the object is theoretical (e.g. sizeof(char[3][1ULL<<62])), you might get an unpleasant surprise.
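One way to sidestep the conversion is to compare in the unsigned domain once the signed value is known to be non-negative, and to convert a size_t only after checking that it fits. What follows is a minimal sketch, assuming a POSIX system; ssize_lt_size and size_to_ssize are hypothetical names, not standard functions.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>     /* static_assert */
#include <limits.h>     /* SSIZE_MAX */
#include <stdbool.h>
#include <stddef.h>     /* size_t */
#include <stdint.h>     /* SIZE_MAX */
#include <sys/types.h>  /* ssize_t */

/* Assumption (usual in practice, not spelled out by POSIX): every
 * non-negative ssize_t value fits in size_t. */
static_assert(SSIZE_MAX <= SIZE_MAX, "ssize_t range exceeds size_t");

/* Hypothetical helper: true if a < b, without converting b to a signed type.
 * A negative a is smaller than any size; otherwise compare as size_t. */
static inline bool ssize_lt_size(ssize_t a, size_t b)
{
    return a < 0 || (size_t)a < b;
}

/* Hypothetical helper: store n in *out only if it is representable.
 * Returns false (and leaves *out untouched) when n > SSIZE_MAX. */
static inline bool size_to_ssize(size_t n, ssize_t *out)
{
    if (n > (size_t)SSIZE_MAX)
        return false;
    *out = (ssize_t)n;
    return true;
}
```

With these, the loop condition in the question could be written as ssize_lt_size(i, sizeof(x)), with no cast at the call site.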

Note that the only valid negative value of type ssize_t is -1, which is an error indication. You might be confusing ssize_t, which is defined by Posix, with ptrdiff_t, which has been defined in standard C since C90. These two types are the same on most platforms, and are usually the signed integer type corresponding to size_t, but none of those behaviours is guaranteed by either standard. However, the semantics of the two types are different, and you should be aware of that when you use them:

  • ssize_t is returned by a number of Posix interfaces in order to allow the function to signal either a number of bytes processed or an error indication; the error indication must be -1. There is no expectation that any possible size will fit into ssize_t; the Posix rationale states that:

    A conforming application would be constrained not to perform I/O in pieces larger than {SSIZE_MAX}.

    This is not a problem for most of the interfaces which return ssize_t because Posix generally does not require interfaces to guarantee to process all data. For example, both read and write accept a size_t which describes the length of the buffer to be read/written and return an ssize_t which describes the number of bytes actually read/written; the implication is that no more than SSIZE_MAX bytes will be read/written even if more data were available. However, the Posix rationale also notes that a particular implementation may provide an extension which allows larger blocks to be processed ("a conforming application using extensions would be able to use the full range if the implementation provided an extended range"), the idea being that the implementation could, for example, specify that return values other than -1 were to be interpreted by casting them to size_t. Such an extension would not be portable; in practice, most implementations do limit the number of bytes which can be processed in a single call to the number which can be reported in ssize_t.

  • ptrdiff_t is (in standard C) the type of the result of the difference between two pointers. In order for subtraction of pointers to be well defined, the two pointers must refer to the same object, either by pointing into the object or by pointing at the byte immediately following the object. The C committee recognised that if ptrdiff_t is the signed equivalent of size_t, then it is possible that the difference between two pointers might not be representable, leading to undefined behaviour, but they preferred that to requiring that ptrdiff_t be a larger type than size_t. You can argue with this decision -- many people have -- but it has been in place since C90 and it seems unlikely that it will change now. (Current standard wording, §6.5.6/9: "If the result is not representable in an object of that type [ptrdiff_t], the behavior is undefined.")

    As with Posix, the C standard does not define undefined behaviour, so it would be a mistake to interpret that as forbidding the subtraction of two pointers in very large objects. An implementation is always allowed to define the result of behaviour left undefined by the standard, so that it is completely valid for an implementation to specify that if P and Q are two pointers to the same object where P >= Q, then (size_t)(P - Q) is the mathematically correct difference between the pointers even if the subtraction overflows. Of course, code which depends on such an extension won't be fully portable, but if the extension is sufficiently common that might not be a problem.
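For the ssize_t case in the first bullet, a portable caller can simply never ask for more than SSIZE_MAX bytes at once, so that every successful return value is representable. The following is a rough sketch under that assumption; write_all is a hypothetical helper name, and retrying on EINTR is a design choice of this sketch rather than anything the text above requires.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <limits.h>     /* SSIZE_MAX */
#include <stddef.h>     /* size_t */
#include <sys/types.h>  /* ssize_t */
#include <unistd.h>     /* write */

/* Hypothetical helper: write all len bytes from buf to fd, never requesting
 * more than SSIZE_MAX bytes in a single call.
 * Returns 0 on success, -1 on error (errno is set by write). */
static inline int write_all(int fd, const void *buf, size_t len)
{
    const unsigned char *p = buf;

    while (len > 0) {
        size_t chunk = len > (size_t)SSIZE_MAX ? (size_t)SSIZE_MAX : len;
        ssize_t n = write(fd, p, chunk);

        if (n == -1) {
            if (errno == EINTR)
                continue;   /* interrupted before writing anything: retry */
            return -1;
        }

        /* On success n lies in [0, chunk], so converting it to size_t
         * cannot lose information. */
        assert(n >= 0 && (size_t)n <= chunk);
        p += (size_t)n;
        len -= (size_t)n;
    }
    return 0;
}
```

Because the request is capped, the -1 error value can never be confused with a legitimate byte count.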

As a final point, the ambiguity of using -1 both as an error indication (in ssize_t) and as a possibly castable result of pointer subtraction (in ptrdiff_t) is not likely to be a problem in practice provided that size_t is as large as a pointer. If size_t is as large as a pointer, the only way that the mathematically correct value of P-Q could be (size_t)(-1) (aka SIZE_MAX) is if the object that P and Q refer to is of size SIZE_MAX, which, given the assumption that size_t is the same width as a pointer, implies that the object plus the following byte occupy every possible pointer value. That contradicts the requirement that some pointer value (NULL) be distinct from any valid address, so we can conclude that the true maximum size of an object must be less than SIZE_MAX.
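The pointer-subtraction case discussed above can be illustrated with a small sketch. Here span_length is a hypothetical helper (essentially a hand-written strlen) that subtracts two pointers into the same object and converts the ptrdiff_t result to size_t only once it is known to be non-negative.

```c
#include <assert.h>
#include <stddef.h>  /* ptrdiff_t, size_t */

/* Hypothetical helper: length of a NUL-terminated string, computed by
 * pointer subtraction. */
static inline size_t span_length(const char *s)
{
    const char *p = s;

    while (*p != '\0')
        p++;

    /* p and s point into the same object and p >= s, so the difference is
     * non-negative. If the string were longer than PTRDIFF_MAX, the
     * subtraction itself would be undefined by the standard. */
    ptrdiff_t d = p - s;
    assert(d >= 0);
    return (size_t)d;
}
```

For objects of ordinary size the subtraction cannot overflow, which is why this idiom is generally considered safe despite the theoretical gap.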

rici answered Dec 11 '22 10:12


Please note that you can't actually do this.

The largest possible object in x86 Linux is just below 0xB0000000 in size, while SSIZE_MAX is 0x7FFFFFFF.

I haven't checked whether read and friends can actually handle the largest possible objects, but if they can, it would work like this:

ssize_t result = read(fd, buf, count);
if (result != -1) {
    size_t offset = (size_t) result;
    /* handle success */
} else {
    /* handle failure */
}

You may find libc is busted. If so, this would work if the kernel is good:

ssize_t result = sys_read(fd, buf, count);
if (result >= 0 || result < -256) {
    size_t offset = (size_t) result;
    /* handle success */
} else {
    errno = (int)-result;
    /* handle failure */
}

Joshua answered Dec 11 '22 10:12