Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is comparing two pointers with < undefined behavior if they are both cast to an integer type?

Let's say I have this code that copies one block of memory to another in a certain order based on their location:

void *my_memmove(void *dest, const void *src, size_t len)
{
    const unsigned char *s = (const unsigned char *)src;
    unsigned char *d = (unsigned char *)dest;

    if(dest < src)
    {
        /* copy s to d forwards */
    }
    else
    {
        /* copy s to d backwards */
    }

    return dest;
}

This is undefined behavior if src and dest do not point to members of the same array(6.8.5p5).

However, let's say I cast these two pointers to uintptr_t types:

#include <stdint.h>

void *my_memmove(void *dest, const void *src, size_t len)
{
    const unsigned char *s = (const unsigned char *)src;
    unsigned char *d = (unsigned char *)dest;

    if((uintptr_t)dest < (uintptr_t)src)
    {
        /* copy s to d forwards */
    }
    else
    {
        /* copy s to d backwards */
    }

    return dest;
}

Is this still undefined behavior if they're not members of the same array? If it is, what are some ways that I could compare these two locations in memory legally?

I've seen this question, but it only deals with equality, not the other comparison operators (<, >, etc).

like image 795
S.S. Anne Avatar asked Aug 24 '19 16:08

S.S. Anne


1 Answers

The conversion is legal but there is, technically, no meaning defined for the result. If instead you convert the pointer to void * and then convert to uintptr_t, there is slight meaning defined: Performing the reverse operations will reproduce the original pointer (or something equivalent).

It particular, you cannot rely on the fact that one integer is less than another to mean it is earlier in memory or has a lower address.

The specification for uintptr_t (C 2018 7.20.1.4 1) says it has the property that any valid void * can be converted to uintptr_t, then converted back to void *, and the result will compare equal to the original pointer.

However, when you convert an unsigned char * to uintptr_t, you are not converting a void * to uintptr_t. So 7.20.1.4 does not apply. All we have is the general definition of pointer conversions in 6.3.2.3, in which paragraphs 5 and 6 say:

An integer may be converted to any pointer type. Except as previously specified [involving zero for null pointers], the result is implementation-defined, might not be correctly aligned, might not point to an entity of the referenced type, and might be a trap representation.

Any pointer type may be converted to an integer type. Except as previously specified [null pointers again], the result is implementation-defined. If the result cannot be represented in the integer type, the behavior is undefined. The result need not be in the range of values of any integer type.

So these paragraphs are no help except they tell you that the implementation documentation should tell you whether the conversions are useful. Undoubtedly they are in most C implementations.

In your example, you actually start with a void * from a parameter and convert it to unsigned char * and then to uintptr_t. So the remedy there is simple: Convert to uintptr_t directly from the void *.

For situations where we have some other pointer type, not void *, then 6.3.2.3 1 is useful:

A pointer to void may be converted to or from a pointer to any object type. A pointer to any object type may be converted to a pointer to void and back again; the result shall compare equal to the original pointer.

So, converting to and from void * is defined to preserve the original pointer, so we can combine it with a conversion from void * to uintptr_t:

(uintptr_t) (void *) A < (uintptr_t) (void *) B

Since (void *) A must be able to produce the original A upon conversion back, and (uintptr_t) (void *) A must be able to produce its (void *) A, then (uintptr_t) (void *) A and (uintptr_t) (void *) B must be different if A and B are different.

And that is all we can say from the C standard about the comparison. Converting from pointers to integers might produce the address bits out of order or some other oddities. For example, they might produce a 32-bit integer contain a 16-bit segment address and a 16-bit offset. Some of those integers might have higher values for lower addresses while others have lower values for lower addresses. Worse, the same address might have two representations, so the comparison might indicate “less than” even though A and B refer to the same object.

like image 80
Eric Postpischil Avatar answered Nov 15 '22 06:11

Eric Postpischil