Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

how to prevent corruption in concurrent lifo stack implemented with atomic compare and swap

Below is a simplified C program that demonstrates a problem I am having with a concurrent stack implemented using the GNU built in compare and swap on an intel cpu. It took me a while to understand what was happening, but now that I do I see that it is well within the guarantees provided by atomic compare and swap.

When a node is popped from the stack, modified, then placed back on the stack, the modified value may become the new head of the stack, corrupting it. The comments in test_get describe the order of events that cause this.

Is there any way to reliably use the same node with the same stack more than once? This is an exaggerated example but even returning an unmodified node to gHead could leak other nodes. The original point of this data structure was to repeatedly use the same nodes.

typedef struct test_node {
    struct test_node *next;
    void *data;
} *test_node_p;

test_node_p gHead = NULL;
unsigned gThreadsDone = 0;

void test_put( test_node_p inMemory ) {
    test_node_p scratch;

    do {
        scratch = gHead;
        inMemory->next = scratch;
    } while ( !__sync_bool_compare_and_swap( &gHead , scratch , inMemory ) );
}

test_node_p test_get( void ) {
    test_node_p result;
    test_node_p scratch;

    do {
        result = gHead;
        if ( NULL == result ) break;
        //  other thread acquires result, modifies next
        scratch = result->next;     //  scratch is 0xDEFACED...
        //  other thread returns result to gHead
    } while ( !__sync_bool_compare_and_swap( &gHead , result , scratch ) );
    //  this thread corrupts gHead with 0xDEFACED... value

    if ( NULL == result ) {
        result = (test_node_p)malloc( sizeof(struct test_node) );
    }

    return result;
}

void *memory_entry( void *in ) {
    test_node_p node;
    int index , count = 1000;
    for ( index = 0 ; index < count ; ++index ) {
        node = test_get();
        *(uint64_t *)(node) |= 0xDEFACED000000000ULL;
        test_put( node );
    }

    __sync_add_and_fetch(&gThreadsDone,1);

    return NULL;
}

void main() {
    unsigned    index , count = 8;
    pthread_t   thread;

    gThreadsDone = 0;

    for ( index = 0 ; index < count ; ++index ) {
        pthread_create( &thread , NULL , memory_entry , NULL );
        pthread_detach( thread );
    }

    while ( __sync_add_and_fetch(&gThreadsDone,0) < count ) {}
}

I am running this test with 8 logical cores. When I use only 4 threads the problem rarely occurs but with 8 it is easily reproducible. I have a MacBook with an Intel Core i7.

I am not interested in using some library or framework that has solved this, I want to know how it was solved. Thank you.

Edit:

Here are two solutions that do not work in my case.

Some architectures provide ll/sc instruction pairs that perform atomic tests on the address instead of the value. Any write to the address, even of the same value, prevents success.

Some architectures provide larger than pointer size compare and swap. With this, a monotonic counter is paired with the pointer which is atomically incremented every time it is used so the value always changes. Some intel chips support this but there is no GNU wrapper.

Here is a play by play of the problem. Two threads, A and B, reach the point in test_get where they have the same value for result and it is not NULL. Then the following sequence occurs:

  1. Thread A passes the compare and swap and returns result from test_get
  2. Thread A modifies the content of result
  3. Thread B dereferences result, getting whatever thread A put there
  4. Thread A finishes with result and calls test_put
  5. Thread A passes the compare and swap in test_put putting result back on gHead
  6. Thread B reaches the compare and swap in test_get and passes
  7. Thread B has now corrupted gHead with the value from Thread A

Here is a similar scenario with three threads that does not require Thread A to modify result.

  1. Thread A passes the compare and swap and returns result from test_get
  2. Thread A does not modify the content of result
  3. Thread B dereferences result, getting a valid value in scratch
  4. Thread C calls test_put with an unrelated value and succeeds
  5. Thread A calls test_put and succeeds in putting result back on gHead
  6. Thread B reaches the compare and swap in test_get and passes
  7. Thread B has now corrupted gHead by leaking whatever thread C added

In either scenario, the problem is that thread A passes the compare and swap twice, once for get then again for put, before thread B reaches the compare and swap for get. Any value thread B has for scratch should be discarded, but is not because the value in gHead appears correct.

Any solution that makes this less likely without actually preventing it just makes the bug harder to track down.

Note that the scratch variable is just a name for wherever the dereferenced value of result is placed before the atomic instruction begins. Removing the name does remove the time slice between dereference and compare that may be interrupted.

Note also that atomic means it succeeds or fails as a whole. Any aligned read of a pointer is implicitly atomic on the hardware in question. As far as other threads are concerned, there is no interruptible point where only half the pointer has been read.

like image 846
drawnonward Avatar asked Mar 23 '23 11:03

drawnonward


1 Answers

Never access an atomic variable through a simple evaluation. Also, for a compare and swap loop like yours, __sync_val_compare_and_swap is more convenient, I think.

/* read the head atomically */
result = __sync_val_compare_and_swap(&gHead, 0, 0);
/* compare and swap until we succeed placing the next field */
while ((oldval = result)) {
   result = __sync_val_compare_and_swap(&gHead, result, result->next);
   if (oldval == result) break;
}
like image 73
Jens Gustedt Avatar answered Apr 05 '23 23:04

Jens Gustedt