Which of these implementations of seqlock are correct?

Question

I am studying the implementation of Seqlock. However all sources I found implement them differently.

Linux Kernel

Linux kernel implements it like this:

static inline unsigned __read_seqcount_begin(const seqcount_t *s)
{
    unsigned ret;

repeat:
    ret = READ_ONCE(s->sequence);
    if (unlikely(ret & 1)) {
        cpu_relax();
        goto repeat;
    }
    return ret;
}

static inline unsigned raw_read_seqcount_begin(const seqcount_t *s)
{
    unsigned ret = __read_seqcount_begin(s);
    smp_rmb();
    return ret;
}

Basically, it uses a volatile read plus a read barrier with acquire semantics on the reader side.

When used, subsequent reads are unprotected:

struct Data {
    u64 a, b;
};

// ...
read_seqcount_begin(&seq);
int v1 = d.a, v2 = d.b;
// ...

rigtorp/Seqlock

RIGTORP_SEQLOCK_NOINLINE T load() const noexcept {
  T copy;
  std::size_t seq0, seq1;
  do {
    seq0 = seq_.load(std::memory_order_acquire);
    std::atomic_signal_fence(std::memory_order_acq_rel);
    copy = value_;
    std::atomic_signal_fence(std::memory_order_acq_rel);
    seq1 = seq_.load(std::memory_order_acquire);
  } while (seq0 != seq1 || seq0 & 1);
  return copy;
}

The load of data is still performed without an atomic operation or protection. However, an atomic_signal_fence with acquire-release semantics is added prior to the read, in contrast to the rmb with acquire semantics in Kernel.

Amanieu/seqlock (Rust)

pub fn read(&self) -> T {
    loop {
        // Load the first sequence number. The acquire ordering ensures that
        // this is done before reading the data.
        let seq1 = self.seq.load(Ordering::Acquire);

        // If the sequence number is odd then it means a writer is currently
        // modifying the value.
        if seq1 & 1 != 0 {
            // Yield to give the writer a chance to finish. Writing is
            // expected to be relatively rare anyways so this isn't too
            // performance critical.
            thread::yield_now();
            continue;
        }

        // We need to use a volatile read here because the data may be
        // concurrently modified by a writer.
        let result = unsafe { ptr::read_volatile(self.data.get()) };

        // Make sure the seq2 read occurs after reading the data. What we
        // ideally want is a load(Release), but the Release ordering is not
        // available on loads.
        fence(Ordering::Acquire);

        // If the sequence number is the same then the data wasn't modified
        // while we were reading it, and can be returned.
        let seq2 = self.seq.load(Ordering::Relaxed);
        if seq1 == seq2 {
            return result;
        }
    }
}

No memory barrier between loading seq and data, but instead a volatile read is used here.

Can Seqlocks Get Along with Programming Language Memory Models? (Variant 3)

T reader() {
  int r1, r2;
  unsigned seq0, seq1;
  do {
    seq0 = seq.load(m_o_acquire);
    r1 = data1.load(m_o_relaxed);
    r2 = data2.load(m_o_relaxed);
    atomic_thread_fence(m_o_acquire);
    seq1 = seq.load(m_o_relaxed);
  } while (seq0 != seq1 || seq0 & 1);
  // do something with r1 and r2;
}

Similar to the Rust implementation, but atomic operations instead of volatile_read are used on data.

Arguments in P1478R1: Byte-wise atomic memcpy

This paper claims that:

In the general case, there are good semantic reasons to require that all data accesses inside such a seqlock "critical section" must be atomic. If we read a pointer p as part of reading the data, and then read *p as well, the code inside the critical section may read from a bad address if the read of p happened to see a half-updated pointer value. In such cases, there is probably no way to avoid reading the pointer with a conventional atomic load, and that's exactly what's desired.

However, in many cases, particularly in the multiple process case, seqlock data consists of a single trivially copyable object, and the seqlock "critical section" consists of a simple copy operation. Under normal circumstances, this could have been written using memcpy. But that's unacceptable here, since memcpy does not generate atomic accesses, and is (according to our specification anyway) susceptable to data races.

Currently to write such code correctly, we need to basically decompose such data into many small lock-free atomic subobjects, and copy them a piece at a time. Treating the data as a single large atomic object would defeat the purpose of the seqlock, since the atomic copy operation would acquire a conventional lock. Our proposal essentially adds a convenient library facility to automate this decomposition into small objects.

My question

Which of the above implementations are correct? Which are correct but inefficient?
Can the volatile_read be reordered before the acquire-read of seqlock?

Support Ukraine · Accepted Answer

Your qoutes from Linux seems wrong.

According to https://www.kernel.org/doc/html/latest/locking/seqlock.html the read process is:

Read path:

do {
        seq = read_seqcount_begin(&foo_seqcount);

        /* ... [[read-side critical section]] ... */

} while (read_seqcount_retry(&foo_seqcount, seq));

If you look at the github link posted in the question, you'll find a comment including nearly the same process.

It seems that you are only looking into one part of the read process. The linked file implements what you need to implement readers and writers but not the reader/writer them self.

Also notice this comment from the top of the file:

* The seqlock seqcount_t interface does not prescribe a precise sequence of
* read begin/retry/end. For readers, typically there is a call to
* read_seqcount_begin() and read_seqcount_retry(), however, there are more
* esoteric cases which do not follow this pattern.

Which of these implementations of seqlock are correct?

Tags:

c

atomic

rust

memory-barriers

seqlock

Linux Kernel

rigtorp/Seqlock

Amanieu/seqlock (Rust)

Can Seqlocks Get Along with Programming Language Memory Models? (Variant 3)

Arguments in P1478R1: Byte-wise atomic memcpy

My question

lz96

1 Answers

Support Ukraine

Recent Activity

Donate For Us

Which of these implementations of seqlock are correct?

Tags:

c

atomic

rust

memory-barriers

seqlock

Linux Kernel

rigtorp/Seqlock

Amanieu/seqlock (Rust)

Can Seqlocks Get Along with Programming Language Memory Models? (Variant 3)

Arguments in P1478R1: Byte-wise atomic memcpy

My question

lz96

1 Answers

Support Ukraine

Related questions

Recent Activity

Donate For Us