Why is false sharing an issue if the variable being modified by a thread is marked as volatile

I've been looking at the Martin Thompson article, which is an explanation of false sharing.

http://mechanical-sympathy.blogspot.co.uk/2011/07/false-sharing.html

    public final class FalseSharing
    implements Runnable
    {
        public final static int NUM_THREADS = 4; // change
        public final static long ITERATIONS = 500L * 1000L * 1000L;
        private final int arrayIndex;

        private static VolatileLong[] longs = new VolatileLong[NUM_THREADS];


        static
        {    
            for (int i = 0; i < longs.length; i++)
            {
                longs[i] = new VolatileLong();
            }
        }

        public FalseSharing(final int arrayIndex)
        {
            this.arrayIndex = arrayIndex;
        }

        public static void main(final String[] args) throws Exception
        {
            final long start = System.nanoTime();
            runTest();
        System.out.println("duration = " + (System.nanoTime() - start));
        }

        private static void runTest() throws InterruptedException
        {
            Thread[] threads = new Thread[NUM_THREADS];

            for (int i = 0; i < threads.length; i++)
            {
                threads[i] = new Thread(new FalseSharing(i));
            }

            for (Thread t : threads)
            {
                t.start();
            }

            for (Thread t : threads)
            {
                t.join();
            }
        }

        public void run()
        {
            long i = ITERATIONS + 1;
            while (0 != --i)
            {
                longs[arrayIndex].value = i;
            }
        }

        public final static class VolatileLong
        {
            public volatile long value = 0L;
            public long p1, p2, p3, p4, p5, p6; // comment out
        }
    }

The example demonstrates the slowdown experienced by multiple threads invalidating each other's cache line, even though each thread is exclusively updating only its own variable.

Figure 1. above illustrates the issue of false sharing. A thread running on core 1 wants to update variable X while a thread on core 2 wants to update variable Y. Unfortunately these two hot variables reside in the same cache line. Each thread will race for ownership of the cache line so they can update it. If core 1 gets ownership then the cache sub-system will need to invalidate the corresponding cache line for core 2. When core 2 gets ownership and performs its update, then core 1 will be told to invalidate its copy of the cache line. This will ping pong back and forth via the L3 cache greatly impacting performance. The issue would be further exacerbated if competing cores are on different sockets and additionally have to cross the socket interconnect.

My question is the following. If all the variables being updated are volatile, why does this padding cause a performance increase? My understanding is that a volatile variable always writes and reads through to main memory. Therefore I'd assume that every write and read to any variable in this example will result in a flush of the current core's cache line.

So according to my understanding, if thread one invalidates thread two's cache line, this will not become apparent to thread two until it goes to read a value from its own cache line. The value it's reading is a volatile value, so this effectively renders the cache dirty anyway, resulting in a read from main memory.

Where have I gone wrong in my understanding?

Thanks

asked Jul 02 '15 by David Wales


1 Answer

If all the variables being updated are volatile, why does this padding cause a performance increase?

So there are two things going on here:

  1. We are dealing with an array of VolatileLong objects, with each thread working on its own VolatileLong. (See private final int arrayIndex.)
  2. Each VolatileLong object has a single volatile field.

Volatile access means that each thread has to both invalidate the cache line that holds its volatile long value and lock that cache line in order to update it. As the article states, a cache line is typically 64 bytes.

The article is saying that by adding padding to the VolatileLong object, it moves the objects that the threads are locking into different cache lines. So even though the different threads still cross memory barriers as they assign their volatile long values, the values are in different cache lines and so the threads won't generate excessive cache-coherency traffic.

In summary, the performance increase happens because even though the threads still lock their cache lines to update the volatile fields, those locks now cover different memory blocks, so they no longer clash with the other threads' locks and trigger cache invalidations.
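To make the effect concrete, here is a minimal, self-contained sketch (not from the article; class names, field names, and the reduced iteration count are my own) that times the same write loop against an unpadded and a padded counter class. On a typical multi-core machine the unpadded run is noticeably slower, though with the iteration count reduced from the article's 500M the gap may be smaller:

```java
// Hypothetical micro-benchmark sketch comparing false sharing (unpadded)
// against padded counters. Timings are indicative only, not a rigorous benchmark.
public class PaddingDemo {
    static final int THREADS = 4;
    static final long ITERATIONS = 1_000_000L; // reduced from the article's 500M

    // Results exposed for inspection after main() runs.
    static long unpaddedNanos, paddedNanos;

    static class Unpadded {
        volatile long value; // neighbours in an array likely share a cache line
    }

    static class Padded {
        volatile long value;
        long p1, p2, p3, p4, p5, p6, p7; // pad toward a 64-byte cache line
    }

    // Start one thread per body, wait for all, and return elapsed nanoseconds.
    static long time(Runnable[] bodies) throws InterruptedException {
        Thread[] ts = new Thread[bodies.length];
        for (int i = 0; i < ts.length; i++) ts[i] = new Thread(bodies[i]);
        long start = System.nanoTime();
        for (Thread t : ts) t.start();
        for (Thread t : ts) t.join();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws InterruptedException {
        Unpadded[] u = new Unpadded[THREADS];
        Padded[] p = new Padded[THREADS];
        for (int i = 0; i < THREADS; i++) {
            u[i] = new Unpadded();
            p[i] = new Padded();
        }

        Runnable[] unpaddedBodies = new Runnable[THREADS];
        Runnable[] paddedBodies = new Runnable[THREADS];
        for (int i = 0; i < THREADS; i++) {
            final int idx = i; // each thread writes only its own slot
            unpaddedBodies[i] = () -> { for (long n = 0; n < ITERATIONS; n++) u[idx].value = n; };
            paddedBodies[i]   = () -> { for (long n = 0; n < ITERATIONS; n++) p[idx].value = n; };
        }

        unpaddedNanos = time(unpaddedBodies);
        paddedNanos = time(paddedBodies);
        System.out.printf("unpadded: %d ms%n", unpaddedNanos / 1_000_000);
        System.out.printf("padded:   %d ms%n", paddedNanos / 1_000_000);
    }
}
```

Note that the JIT or a future JVM could in principle reorder or elide the padding fields; JDK 8+ offers the (internal) @Contended annotation as a more robust alternative, which is why hand-padding like this is a demonstration technique rather than something to rely on in production.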

answered Nov 02 '22 by Gray