C overcoming aliasing restrictions (unions?)

Assume I have a sample source file, test.c, which I am compiling like so:

$ gcc -O3 -Wall test.c

test.c looks something like this:

/// CMP128(x, y)
//
// arguments
//  x - any pointer to a 128-bit int
//  y - any pointer to a 128-bit int
//
// returns -1, 0, or 1 if x is less than, equal to, or greater than y
//
#define CMP128(x, y) // magic goes here

// example usages

uint8_t  A[16];
uint16_t B[8];
uint32_t C[4];
uint64_t D[2];
struct in6_addr E;
uint8_t* F;

// use CMP128 on any combination of pointers to 128-bit ints, i.e.

CMP128(A, B);
CMP128(&C[0], &D[0]);
CMP128(&E, F);

// and so on

Let's also say I accept the restriction that if you pass in two overlapping pointers, you get undefined results.

I've tried something like this (imagine these macros are properly formatted with backslash-escaped newlines at the end of each line):

#define CMP128(x, y) ({
    uint64_t* a = (void*)(x);
    uint64_t* b = (void*)(y);

    // compare a[0] with b[0], a[1] with b[1]
})

But when I dereference a in the macro (a[0] < b[0]), gcc complains that "dereferencing breaks strict-aliasing rules".

I had thought that you were supposed to use unions to properly refer to a single place in memory in two different ways, so next I tried something like this:

#define CMP128(x, y) ({
    union {
        typeof(x) a;
        typeof(y) b;
        uint64_t* c;
    } d = { .a = (x) },
      e = { .b = (y) };

    // compare d.c[0] with e.c[0], etc
})

Except that I get the exact same complaints from the compiler about strict-aliasing rules.

So: is there some way to do this without breaking strict-aliasing, short of actually COPYING the memory?

(may_alias doesn't count; it just allows you to bypass the strict-aliasing rules.)

EDIT: use memcmp to do this. I got caught up on the aliasing rules and didn't think of it.

Todd Freed asked Jun 26 '11 21:06

1 Answer

The compiler is correct: the aliasing rules are determined by the so-called 'effective type' of the object (i.e. the memory location) you're accessing, regardless of any pointer magic. In this case, type-punning the pointers with a union is no different from an explicit cast. Using the cast is actually preferable, as the standard does not guarantee that arbitrary pointer types have compatible representations, i.e. with the union you're unnecessarily depending on implementation-defined behaviour.

If you want to conform to the standard, you need to copy the data to new variables or use a union during the declaration of the original variables.
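Both options can be sketched roughly as follows. This is only an illustration: the names cmp128_copy, union u128 and cmp128_union are hypothetical, and the code assumes each value is held as two 64-bit words in host byte order with the most significant word first (matching the a[0]-then-a[1] comparison in the question). For a fixed 16-byte size, optimising compilers commonly turn the memcpy() calls into plain loads, so the copy tends to be cheaper than it looks.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Option 1 (hypothetical helper): copy both operands into fresh uint64_t
 * locals. memcpy may read any object, so no aliasing rule is broken.
 * Assumed layout: word 0 is the most significant 64 bits. */
static inline int cmp128_copy(const void *x, const void *y)
{
    uint64_t a[2], b[2];

    assert(x != NULL && y != NULL);
    memcpy(a, x, sizeof a);
    memcpy(b, y, sizeof b);

    for (int i = 0; i < 2; i++) {
        if (a[i] != b[i])
            return a[i] < b[i] ? -1 : 1;
    }
    return 0;
}

/* Option 2 (hypothetical type): declare the storage as a union from the
 * start, so every view of the 16 bytes is a member of the same object. */
union u128 {
    uint8_t  b[16];
    uint16_t h[8];
    uint32_t w[4];
    uint64_t q[2];
};

/* Compares through the uint64_t member; same word-order assumption. */
static inline int cmp128_union(const union u128 *x, const union u128 *y)
{
    assert(x != NULL && y != NULL);
    for (int i = 0; i < 2; i++) {
        if (x->q[i] != y->q[i])
            return x->q[i] < y->q[i] ? -1 : 1;
    }
    return 0;
}
```

With the first helper, a call such as cmp128_copy(A, &E) accepts any of the pointer types from the question, since they all convert to const void*.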

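If your 128-bit integers are either big-endian or little-endian (i.e. not mixed-endian), you could also compare them byte by byte. For big-endian storage (most significant byte first), memcmp() gives the ordering directly through the sign of its return value. For little-endian storage, memcmp() compares the least significant byte first, so it is only reliable for equality, and negating its result does not fix that; do a byte-wise comparison yourself, starting from the most significant byte. Either way this is permitted: access through pointers of character type is an exception to the aliasing rule.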
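A minimal sketch of the character-type route, with two hypothetical helpers (cmp128_be and cmp128_le are illustrative names, not from the post). The first assumes the 16 bytes are stored most significant byte first, where memcmp() already yields the numeric order; the second assumes least significant byte first and therefore walks the bytes from the last down to the first. memcmp() only promises a negative, zero, or positive result, so it is normalised to -1, 0, or 1 here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: operands are 16 bytes, most significant byte first.
 * Lexicographic byte order then equals numeric order. */
static inline int cmp128_be(const void *x, const void *y)
{
    assert(x != NULL && y != NULL);
    int r = memcmp(x, y, 16);
    return (r > 0) - (r < 0);   /* normalise to -1, 0, or 1 */
}

/* Hypothetical helper: operands are 16 bytes, least significant byte first.
 * Reading through unsigned char* is allowed for any object type. */
static inline int cmp128_le(const void *x, const void *y)
{
    const unsigned char *a = x;
    const unsigned char *b = y;

    assert(x != NULL && y != NULL);
    /* Start at byte 15 (most significant) and work down to byte 0. */
    for (size_t i = 16; i-- > 0; ) {
        if (a[i] != b[i])
            return a[i] < b[i] ? -1 : 1;
    }
    return 0;
}
```

Which helper applies depends on how the values were produced: a struct in6_addr holds its bytes in network (big-endian) order, whereas an integer array filled on a little-endian host would need the second form.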

Christoph answered Oct 23 '22 22:10