Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Lazy initialization / memoization without volatile

It appears the Java Memory Model does not define "refreshing" and "flushing" of the local cache, instead people only call it that way for simplicity, but actually the "happens-before" relationship implies refreshing and flushing somehow (would be great if you can explain that, but not directly part of the question).

This is getting me really confused combined with the fact that the section about the Java Memory Model in the JLS is not written in a way which makes it easy to understand.

Therefore could you please tell me if the assumptions I made in the following code are correct and if it is therefore guaranteed to run correctly?

It is partially based on the code provided in the Wikipedia article on Double-checked locking, however there the author used a wrapper class (FinalWrapper), but the reason for this is not entirely obvious to me. Maybe to support null values?

public class Memoized<T> {
    private T value;
    private volatile boolean _volatile;
    private final Supplier<T> supplier;

    public Memoized(Supplier<T> supplier) {
        this.supplier = supplier;
    }

    public T get() {
        /* Apparently have to use local variable here, otherwise return might use older value
         * see https://jeremymanson.blogspot.com/2008/12/benign-data-races-in-java.html
         */
        T tempValue = value;

        if (tempValue == null) {
            // Refresh
            if (_volatile);
            tempValue = value;

            if (tempValue == null) {
                // Entering refreshes, or have to use `if (_volatile)` again?
                synchronized (this) {
                    tempValue = value;

                    if (tempValue == null) {
                        value = tempValue = supplier.get();
                    }

                    /* 
                     * Exit should flush changes
                     * "Flushing" does not actually exists, maybe have to use  
                     * `_volatile = true` instead to establish happens-before?
                     */
                }
            }
        }

        return tempValue;
    }
}

Also I have read that the constructor call can be inlined and reordered resulting in a reference to an uninitialized object (see this comment on a blog). Is it then safe to directly assign the result of the supplier or does this have to be done in two steps?

value = tempValue = supplier.get();

Two steps:

tempValue = supplier.get();
// Reorder barrier, maybe not needed?
if (_volatile);
value = tempValue;

Edit: The title of this question is a little bit misleading, the goal was to have reduced usage of a volatile field. If the initialized value is already in the cache of a thread, then value is directly accessed without the need to look in the main memory again.

like image 352
Marcono1234 Avatar asked Jan 08 '19 21:01

Marcono1234


2 Answers

You can reduce usage of volatile if you have only a few singletons. Note: you have to repeat this code for each singleton.

enum LazyX {
   ;
   static volatile Supplier<X> xSupplier; // set somewhere before use

   static class Holder {
       static final X x = xSupplier.get();
   }

   public static X get() {
       return Holder.x;
   }
}

If you know the Supplier, this becomes simpler

enum LazyXpensive {
   ;

   // called only once in a thread safe manner
   static final Xpensive x = new Xpensive();

   // after class initialisation, this is a non volatile read
   public static Xpensive get() {
       return x;
   }
}

You can avoid making the field volatile by using Unsafe

import sun.misc.Unsafe;

import java.lang.reflect.Field;
import java.util.function.Supplier;

public class LazyHolder<T> {
    static final Unsafe unsafe = getUnsafe();
    static final long valueOffset = getValueOffset();

    Supplier<T> supplier;
    T value;

    public T get() {
        T value = this.value;
        if (value != null) return value;

        return getOrCreate();
    }

    private T getOrCreate() {
        T value;
        value = (T) unsafe.getObjectVolatile(this, valueOffset);
        if (value != null) return value;

        synchronized (this) {
            value = this.value;
            if (value != null) return value;
            this.value = supplier.get();
            supplier = null;
            return this.value;
        }
    }


    public static Unsafe getUnsafe() {
        try {
            Field theUnsafe = Unsafe.class.getDeclaredField("theUnsafe");
            theUnsafe.setAccessible(true);
            return (Unsafe) theUnsafe.get(null);
        } catch (NoSuchFieldException | IllegalAccessException e) {
            throw new AssertionError(e);
        }
    }

    private static long getValueOffset() {
        try {
            return unsafe.objectFieldOffset(LazyHolder.class.getDeclaredField("value"));
        } catch (NoSuchFieldException e) {
            throw new AssertionError(e);
        }
    }
}

However, having the extra look up is a micro optimisation. If you are willing to take a synchronisation hit once per thread, you can avoid using volatile at all.

like image 99
Peter Lawrey Avatar answered Oct 13 '22 00:10

Peter Lawrey


Your code is not thread safe, which can easily be shown by stripping off all irrelevant parts:

public class Memoized<T> {
    private T value;
    // irrelevant parts omitted

    public T get() {
        T tempValue = value;

        if (tempValue == null) {
            // irrelevant parts omitted
        }

        return tempValue;
    }
}

So value has no volatile modifier and you’re reading it within the get() method without synchronization and when non-null, proceed using it without any synchronization.

This code path alone is already making the code broken, regardless of what you are doing when assigning value, as all thread safe constructs require both ends, reading and writing sides, to use a compatible synchronization mechanism.

The fact that you are using esoteric constructs like if (_volatile); becomes irrelevant then, as the code is already broken.

The reason why the Wikipedia example uses a wrapper with a final field is that immutable objects using only final fields are immune to data races and hence, the only construct that is safe when reading its reference without a synchronization action.

Note that since lambda expressions fall into the same category, you can use them to simplify the example for your use case:

public class Memoized<T> {
    private boolean initialized;
    private Supplier<T> supplier;

    public Memoized(Supplier<T> supplier) {
        this.supplier = () -> {
            synchronized(this) {
                if(!initialized) {
                    T value = supplier.get();
                    this.supplier = () -> value;
                    initialized = true;
                }
            }
            return this.supplier.get();
        };
    }

    public T get() {
        return supplier.get();
    }
}

Here, supplier.get() within Memoized.get() may read an updated value of supplier without synchronization action, in which case it will read the correct value, because it is implicitly final. If the method reads an outdated value for the supplier reference, it will end up at the synchronized(this) block which uses the initialized flag to determine whether the evaluation of the original supplier is necessary.

Since the initialized field will only be accessed within the synchronized(this) block, it will always evaluate to the correct value. This block will be executed at most once for every thread, whereas only the first one will evaluate get() on the original supplier. Afterwards, each thread will use the () -> value supplier, returning the value without needing any synchronization actions.

like image 27
Holger Avatar answered Oct 13 '22 00:10

Holger