Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Lockless Reader/Writer Synchronization In MVCC Implementation

Tags:

c++

c++11

I've been obsessing over the correctness of some lockless code, and I would really appreciate any input I can get. My question is about how to achieve some required inter-thread synchronization using acquire & release semantics in C++11's memory model. Before my question, some background...

In MVCC, a writer can install a new version of an object without affecting readers of old object versions. However, if a writer installs a new version of an object when a reader with higher-numbered timestamp has already acquired a reference to an older version, the writer transaction will have to be rolled back & retried. This is to preserve serializable snapshot isolation (so it's as if all successful transactions executed one after the other in timestamp order). Readers never have to retry due to writes, but writers may have to be rolled back & retried if their activity would "pull the rug out from under" readers with higher-numbered timestamps. To implement this constraint, a read-timestamp is used. The idea is that a reader updates the read-timestamp of an object to its own timestamp before acquiring a reference, and a writer will check the read-timestamp to see whether it's OK to proceed with a new version of that object.

Suppose there are two transactions: T1 (the writer) and T2 (the reader) which are running in separate threads.

T1 (the writer) does this:

void
DataStore::update(CachedObject* oldObject, CachedObject* newObject)
{
    .
    .
    .
    COcontainer* container = oldObject->parent();
    tid_t newTID = newObject->revision();
    container->setObject(newObject);
    tid_t* rrp = &container->readRevision;
    tid_t rr = __atomic_load_n(rrp, __ATOMIC_ACQUIRE);
    while (true)
    {
        if (rr > newTID) throw TransactionRetryEx();
        if (__atomic_compare_exchange_n(
            rrp,
            &rr,
            rr,
            false,
            __ATOMIC_RELEASE,
            __ATOMIC_RELAXED)
        {
            break;
        }
    }
}

T2 (the reader) does this:

CachedObject*
Transaction::onRead(CachedObject* object)
{
    tid_t tid = Transaction::mine()->tid();
    COcontainer* container = object->parent();
    tid_t* rrp = &container->readRevision;
    tid_t rr = __atomic_load_n(rrp, __ATOMIC_ACQUIRE);
    while (rr < tid)
    {
        if (__atomic_compare_exchange_n(
            rrp,
            &rr,
            tid,
            false,
            __ATOMIC_ACQUIRE,
            __ATOMIC_ACQUIRE))
        {
            break;
        }
    }
    // follow the chain of objects to find the newest one this transaction can use
    object = object->newest();
    // caller can use object now
    return object;
}

This is a simple summary of the situation I'm worried about:

     A    B    C
<----*----*----*---->
   timestamp order

A: old object's timestamp
B: new object's timestamp (T1's timestamp)
C: "future" reader's timestamp (T2's timestamp)

* If T2@C reads object@A, T1@B must be rolled back.

If T1 is fully executed before T2 commences (and T1's effects are fully visible to T2), then there's no problem. T2 will acquire a reference to the object version installed by T1, which it can use because T1's timestamp is less than T2's. (A transaction can read objects "from the past" but it cannot "peer into the future").

If T2 is fully executed before T1 commences (and T2's effects are fully visible to T1), then there's no problem. T1 will see that a transaction "from the future" has potentially read the older version of the object. Therefore, T1 will be rolled back and a new transaction will be created to retry performance of the work.

The trouble (of course) is guaranteeing correct behavior when T1 and T2 run concurrently. It would be very simple to just use a mutex to eliminate race conditions, but I would only accept a solution with locks if I was convinced there was no other way. I'm pretty sure it should be possible to do this with the acquire & release memory models of C++11. I'm OK with some complexity, so long as I can be satisfied that the code is correct. I really want readers to run as fast as possible, which is a main selling feature of MVCC.

Questions:

1. Looking at the above (partial) code, do you think a race condition exists, so that T1 could fail to be rolled back (via throw TransactionRetryEx()) in a case where T2 proceeds to use the older version of the object?

2. If the code is wrong, explain why and please provide general guidance for making it right.

3. Even if the code looks right, can you see how it can be more efficient?

My reasoning in DataStore::update() is that if the call to __atomic_compare_exchange_n() succeeds, it means the "conflicting" reader thread hasn't yet updated the read-timestamp, and therefore it also hasn't traversed the chain of object versions to find the newly accessible version that was just installed.

I'm about to buy the book "Transactional Information Systems: Theory, Algorithms, and the Practice of Concurrency Control and Recovery", but I thought I'd also bother you :D I think I should have bought the book sooner, but I'm also fairly sure I won't learn anything that would invalidate a large chunk of my work.

I hope I've given enough information to make an answer possible. I'll gladly edit my question to make it more clear if I receive constructive criticism of it. If this question (or one much like it) has already been asked & answered, that would be great.

Thanks!

like image 264
Adam McKee Avatar asked Sep 12 '12 08:09

Adam McKee


1 Answers

This is complicated, I can't say anything to 1. and 2. yet, but regarding 3 I've noticed something:

When __atomic_compare_exchange_n returns false, then the current value of *rrp is written into rr, so the __atomic_load()s within the loops are both redundant (in T2 just throw it out, in T1 do it once before the loop like in T2).

As general remark, it's probably not necessary to think about acquire/release until everything else in the algorithm is finished; then you can check to see how strong a memory barrier is required "everywhere".

like image 102
Seg Fault Avatar answered Oct 20 '22 19:10

Seg Fault