Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How thread-safe is my uniquing code?

I have a small value class that I create lots of instances of. Often with the same value. This class is used as a kind of identifier, so the main use is comparing instances of this class with each other (via isEqual:).

To save some memory and time comparing I keep only unique instances in a NSHashTable and use pointer comparison instead of isEqual:.

So my designated initializer looks like this:

- initWithStuff: (NSString *)stuff;
{
    self = [super init];
    if (!self) return nil;

    // ... (actual initialisation code omitted)

   hash = UniquingHashTable();

   static OSSpinLock spinlock = OS_SPINLOCK_INIT;
   OSSpinLockLock( &spinlock );

   id member = [hash member: self];
   if (member) self = member;
   else [hash addObject: self];

   OSSpinLockUnlock( &spinlock );

   return self;
}

UniquingHashTable() returns the global NSHashTable or creates it via [NSHashTable weakObjectsHashTable] if it doesn’t exist yet. The weakObjectsHashTable is the important bit - it stores weak pointers to the objects that automatically get removed once there is no other strong reference to that object. The initialization is thread-safe thanks to dispatch_once.

This works fine, peak memory usage is significantly lower and all my test cases still pass. But I am not quite sure if I can rely on pointer-comparison to test for equality. Is there any case or race condition where I get two distinct instances at the same time (so the pointers are different) but they still compare equal by isEqual:?

To clarify what I have/want:

Given

MyClass *a = [[MyClass alloc] initWithStuff: stringA];
MyClass *b = [[MyClass alloc] initWithStuff: stringB];

we always have

[a isEqual: b] == [stringA isEqual: stringB];

This is not changed with the new code.

What I am trying to achieve is that also

[a isEqual: b] == (a == b)

so that I can replace the isEqual: with a faster pointer comparison. (Yes, I measured this, this will be significant).

For single threaded code or if I was using a NSMutableSet instead of a weak NSHashTable this always works out. I’m just not sure if there is any race condition so that I can have the case where

[a isEqual: b] && (a != b)

is true.

like image 618
Sven Avatar asked Oct 05 '22 13:10

Sven


1 Answers

That code should be fine, but I would suggest a handful of possible changes. These are based on assumptions/observations and may or may not apply.

If it is possible, move the uniquing code to a class [factory] method and unique on the value of stuff. That may be pseudo-code, but it would appear that stuff is the sole source of initialization state. Regardless, doing so would save a pass through alloc/dealloc in the collision [cached] case.

Assuming nothing else uses UniquingHashTable(), move the static declaration of that = the dispatch_once() initializer into your initWithStuff: (or factory) method. It makes the code clearer and less fragile (in that nothing can muck with the hashtable externally).

Use a serial GCD queue for your serialization in the cache lookup code; queues are really cheap and often don't have to cross the user->kernel barrier to enqueue and execute a block.

The hashtable will be using a combination of hash and isEqual: to exactly determine object equality (an implementation detail, really... since a weak hashtable uses Object personality, isEqual: will be used as the final say to determine equality). Thus, hash collisions should not be an issue. If you want to ensure that two of your objects with different addresses are not equal, implement your isEqual: accordingly (where two objects that are equal must have the same hash, two inequal objects may have the same hash value).


The implementation of isEqual: generally first tests pointer equality and, only on failure, does it go down the required rabbit hole of testing internal state to ensure equality.

If you have a custom isEqual:, it should do the same.

Your code (as originally implemented) is already incurring overhead of a malloc and free (via +alloc and -dealloc) in the collision case. That would be a seemingly huge amount of overhead compared to a few invocations of isEqual: (assuming your hash table isn't hugely collision heavy).

How are you measuring the overhead of isEqual: vs. pointer comparison?

like image 109
bbum Avatar answered Oct 13 '22 10:10

bbum