Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Checking equality of arbitrary C types using Objective-C @encode -- already exists?

Tags:

objective-c

I'm looking for some code to do a general purpose equality comparison of arbitrary C types as supported by Objective-C's @encode() directive. I'm essentially looking for a function like:

BOOL CTypesEqual(void* a, const char* aEnc, void* b, const char* bEnc)

Which you might call like this:

struct { ... } foo = { yadda, yadda, yadda };
struct { ... } bar = { yadda, yadda, yadda };

BOOL isEqual = CTypesEqual(&foo, @encode(typeof(foo)), &bar, @encode(typeof(bar)));

Here's what I've discovered so far:

Revelation #1

You can't do this:

BOOL CTypesEqual(void* a, const char* aEnc, void* b, const char * bEnc)
{
    if (0 != strcmp(aEnc, bEnc)) // different types
        return NO;

    NSUInteger size = 0, align = 0;
    NSGetSizeAndAlignment(aEnc, &size, &align);
    if (0 != memcmp(a, b, size))
        return NO;

    return YES;
}

...because of garbage in the spaces between members created by alignment restrictions. For instance, the following will fail the memcmp based equality check, despite the two structs being equal for my purposes:

typedef struct {
    char first;
    NSUInteger second;
} FooType;

FooType a, b;
memset(&a, 0x55555555, sizeof(FooType));
memset(&b, 0xAAAAAAAA, sizeof(FooType));

a.first = 'a';
a.second = ~0;

b.first = 'a';
b.second = ~0;

Revelation #2

You can abuse NSCoder to do this, like so:

BOOL CTypesEqual(void* a, const char* aEnc, void* b, const char * bEnc)
{
    if (0 != strcmp(aEnc, bEnc)) // different types
        return NO;

    NSMutableData* aData = [[NSMutableData alloc] init];
    NSArchiver* aArchiver = [[NSArchiver alloc] initForWritingWithMutableData: aData];
    [aArchiver encodeValueOfObjCType: aEnc at: a];

    NSMutableData* bData = [[NSMutableData alloc] init];
    NSArchiver* bArchiver = [[NSArchiver alloc] initForWritingWithMutableData: bData];
    [bArchiver encodeValueOfObjCType: bEnc at: b];

    return [aData isEqual: bData];
}

That's great and all, and provides the expected results, but results in who knows how many heap allocations (at least 6) and makes an operation that should be relatively cheap, very expensive.

Revelation #3

You can't use NSValue for this. As in, the following does not work:

typedef struct {
    char first;
    NSUInteger second;
} FooType;

FooType a, b;
memset(&a, 0x55555555, sizeof(FooType));
memset(&b, 0xAAAAAAAA, sizeof(FooType));

a.first = 'a';
a.second = 0xFFFFFFFFFFFFFFFF;

b.first = 'a';
b.second = 0xFFFFFFFFFFFFFFFF;

NSValue* aVal = [NSValue valueWithBytes: &a objCType: @encode(typeof(a))];
NSValue* bVal = [NSValue valueWithBytes: &b objCType: @encode(typeof(b))];

BOOL isEqual = [aVal isEqual: bVal];

Revelation #4

Cocotron's NSCoder implementation punts on all the hard stuff (arbitrary structs, unions, etc.), thus is no source of further inspiration.

My attempt so far

So I started in on this, docs in hand, and I roughly got this far:

BOOL CTypesEqual(void* a, const char* aEnc, void* b, const char * bEnc)
{
    if (0 != strcmp(aEnc, bEnc)) // different types
        return NO;

    return SameEncCTypesEqual(a, b, aEnc);
}

static BOOL SameEncCTypesEqual(void* a, void* b, const char* enc)
{
    switch (enc[0])
    {
        case 'v':
        {
            // Not sure this can happen, but...
            return YES;
        }
        case 'B':
        case 'c':
        case 'C':
        case 's':
        case 'S':
        case 'i':
        case 'I':
        case 'l':
        case 'L':
        case 'q':
        case 'Q':
        case 'f':
        case 'd':
        case '@':
        case '#':
        {
            NSUInteger size = 0, align = 0;
            NSGetSizeAndAlignment(enc, &size, &align);
            const int result = memcmp(a, b, size);
            if (result)
                return NO;
            break;
        }
        case ':':
        {
            if (!sel_isEqual(*(SEL*)a, *(SEL*)b))
                return NO;
        }

        case '*':
        {
            if (strcmp((const char *)a, (const char *)b))
                return NO;
        }
        case '{':
        {
            // Get past the name
            for (const char *prev = enc - 1, *orig = enc; prev < orig || (prev[0] != '=' && prev[0] != '\0' && enc[0] != '}'); prev++, enc++);

            // Chew through it
            for (NSUInteger pos = 0, size = 0, align = 0; enc[0] != '}' && enc[0] != '\0'; enc++, pos += size, size = 0, align = 0)
            {
                NSGetSizeAndAlignment(enc, &size, &align);

                // figure out where we should be w/r/t alignment
                pos = align * (pos + align - 1) / align;

                // Descend
                BOOL sub = SameEncCTypesEqual(((uint8_t*)a) + pos, ((uint8_t*)b) + pos, enc);
                if (!sub)
                    return NO;
            }
            break;
        }
        case '[':
        {
            // Skip the '['
            enc++;

            // Get numElements
            int numElements = 0;
            sscanf(enc, "%d", &numElements);

            // Advance past the number
            for (; enc[0] <= '9' && enc[0] >= '0'; enc++);

            // Get the size
            NSUInteger size = 0, align = 0;
            const char * const elementType = enc;
            NSGetSizeAndAlignment(elementType, &size, &align);

            for (NSUInteger i = 0; i < numElements; i++)
            {
                BOOL elementEqual = SameEncCTypesEqual(((uint8_t*)a) + i * size, ((uint8_t*)b) + i * size, elementType);
                if (!elementEqual)
                    return NO;
            }
            break;
        }
        case '(':
        {
            NSLog(@"unions?! seriously, bro?");
            return NO;
            break;
        }
        default:
        {
            NSLog(@"Unknown type: %s", enc);
            return NO;
            break;
        }
    }
    return YES;
}

...and about when I got to unions, I said to myself, "Self, why are you doing this? This is exactly the sort of code with a million little corner cases to miss, and really, it seems like something that should have been written a bunch of times already, by other people with way more patience." So here I am. Anyone know of a tried-and-true implementation of this in the (public) frameworks or "in the wild" that doesn't come with all the extra weight of using NSCoder?

like image 552
ipmcc Avatar asked Nov 23 '13 23:11

ipmcc


2 Answers

Why not just make a function that skips the padding?

First, you would need to guess the padding policy. That's the easy part.

Then you use the encoding information provided by @encode to map the data types to masks, and use those masks to compare.

A trivial example:

struct m { char a; int b; };
struct n { char c; struct m d; int e; };

Could be transformed into (lets assume sizeof(int) is 4):

struct_m_mask = { 1, 4, 0 };
struct_n_mask = { 1, 1, 4, 4, 0 };

Optimization of the representation is of course possible in the case the alignment allows, e.g.:

struct_b_mask = { 2, 4, 4, 0 };

Then, you could walk this array to do the comparisons. A[n+1] - A[n] gives the hole size, if there's no hole ahead, like in the case of (b, e), then you can merge them.

That's the simplest way I could come up with. Probably you can implement more complex tricks.

BTW, my guess is that given the stuff is constant, the compiler could compute the masks at compile-time, but perhaps that's too much asking...

like image 95
Ismael Luceno Avatar answered Oct 22 '22 18:10

Ismael Luceno


Regarding the padding issue you may have to make something like a function for each struct that turns it into an array for comparison or returns a single value at an index. It isn't very optimal but that is usually why sorting algorithms let you pass in function pointers that do the actual comparison.

like image 40
0xFADE Avatar answered Oct 22 '22 18:10

0xFADE