Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Lazy, but persisted, evaluation of java 8 lambda

Tags:

lambda

java-8

I am currently working on creating my own persistent array in java, which uses a binary search tree to store an collection of values.

I want to add a map method that takes a Function as an argument to generate a new array. I don't want to evaluate the functions unless the particular value is requested. This is pretty simple to do as lambdas are lazy evaluated. However, I also only want a function to be evaluated only once, even if the result is requested multiple times.

I could create a node that stores the supplier and updates the result when evaluated:

class Node<T> {

    private T value;
    private Supplier<T> supplier;

    public T get() {
        if (null != value)
            return value;
        value = supplier.get();
        return value;
    }
}

...where supplier is derived from a Function being applied to a value in an older version of the persisted array.

However, this is no longer a functional approach, and has the potential to cause errors in a multi-threaded system*. It also doesn't yield an advantage in the case where supplier returns a null value**.

Another approach would be return an instance of Node on a get call:

class Node<T> {

    private final Optional<T> value;
    private final Supplier<T> supplier;

    Node(Supplier<T> supplier, T value) {
        this.supplier = supplier;
        this.value = Optional.ofNullable(value);
    }

    public Tuple<Node<T>, T> get() {
        if (null != value)
            return new Tuple<>(this, value.orElse(null));
        T result = supplier.get();
        Node<T> newNode = new Node<>(null, result);
        return new Tuple<>(newNode, result);
    }
}

I like this approach for staying functional; but it would require a lot of overhead in all the parent nodes going up the tree for just a get call. And it would require cumbersome unboxing in the using application's code.

Does anyone have another approach they can think of to make this work the way I'm asking? Thanks.

*This could be solved using locking mechanisms, but adds a layer of complexity I'm hoping to avoid.

**I've thought about making value an Optional<T>, where null means hasn't been evaluate, and Optional.empty() as has been evaluated and returns a null. However, this works around my issue, rather than solving it.

For anyone not familiar with a persisted array, it's a data structure that creates a new instance of itself whenever an update is performed. This allows it to be purely immutable. Using a binary tree (or more common 32-bit tree) method allows the updates to reduce duplicating data, both in speed and memory.

EDIT:

The code for the collection can be found on github. For description of use, you can look in the test folder.

like image 776
JRogerC Avatar asked Jun 12 '17 13:06

JRogerC


People also ask

Why Lambda is lazy loading?

Lambdas allows Java to be lazy, as they represent a function to be executed that can be passed around and will only be evaluated when required. The examples below are pretty simple but should demonstrate the order of execution, showing that the lambdas are not executed straight away.

Does Java have lazy evaluation?

Lazy evaluation in Java. Lazy evaluation is another functional programming technique that is not new to Java 8. This technique delays the evaluation of an expression until its value is needed. In most cases, Java eagerly evaluates an expression that is bound to a variable.

Which technique used for catching the value in lazy evaluation?

Lazy Evaluation using Python The range method in Python follows the concept of Lazy Evaluation. It saves the execution time for larger ranges and we never require all the values at a time, so it saves memory consumption as well.

Are lambdas slower Java?

algorithm - Java lambdas 20 times slower than anonymous classes - Stack Overflow. Stack Overflow for Teams – Start collaborating and sharing organizational knowledge.


2 Answers

Disclaimer: this answer doesn't respond the question directly, as it doesn't use neither Supplier nor Optional directly in the Node class. Instead, a generic functional programming technique is presented that might help solve the problem.


If the problem is all about evaluating the function only once for each input value, then you shouldn't change your tree/array/nodes. Instead, memoize the function, which is a pure functional approach:

In computing, memoization or memoisation is an optimization technique used primarily to speed up computer programs by storing the results of expensive function calls and returning the cached result when the same inputs occur again

Here's a way to do it, inspired by this excellent article written by Pierre-Yves Saumont (please check it for an in-depth introduction to memoization):

public static <T, U> Function<T, U> memoize(Function<T, U> function) {
    Map<T, U> cache = new ConcurrentHashMap<>();
    return input -> cache.computeIfAbsent(input, function::apply);
}

Suppose you have a method that takes quite long to execute. Then, you could use the memoize method this way:

// This method takes quite long to execute
Integer longCalculation(Integer x) {
    try {
        Thread.sleep(1_000);
    } catch (InterruptedException ignored) {
    }
    return x * 2;
}

// Our function is a method reference to the method above
Function<Integer, Integer> function = this::longCalculation;

// Now we memoize the function declared above
Function<Integer, Integer> memoized = memoize(function);

Now, if you call:

int result1 = function.apply(1);
int result2 = function.apply(2);
int result3 = function.apply(3);
int result4 = function.apply(2);
int result5 = function.apply(1);

You'll notice that the five calls take ~5 seconds altogether (1 second for each call).

However, if you use the memoized function with the same input values 1 2 3 2 1:

int memoizedResult1 = memoized.apply(1);
int memoizedResult2 = memoized.apply(2);
int memoizedResult3 = memoized.apply(3);
int memoizedResult4 = memoized.apply(2); // <-- returned from cache
int memoizedResult5 = memoized.apply(1); // <-- returned from cache

You'll notice that now the five calls take ~3 seconds altogether. This is because the last two results are immediately returned from the cache.


So, back to your structure... Inside your map method, you could just memoize the given function and use the returned memoized function instead. Internally, this will cache the function's return values in the ConcurrentHashMap.

As the memoize method uses a ConcurrentHashMap internally, you don't need to worry about concurrency.

Note: This is just the beginning... I'm thinking about two possible improvements here. One would be to limit the size of the cache, so that it doesn't take the whole memory if the domain of the given function is too big. The other improvement would be to memoize the given function only if it hasn't been memoized previously. But these details are left as an exercise for the reader... ;)

like image 191
fps Avatar answered Oct 12 '22 22:10

fps


I also only want a function to be evaluated only once, even if the result is requested multiple times.

How about this?

class Node<T> {
    private Supplier<T> supplier;

    Node(T value, Supplier<T> supplier) {
        this.supplier = sync(lazy(value, supplier));
    }

    public T get() {
        return supplier.get();
    }
}

the sync method only synchronizing the Supplier once, when the target is called, the lock is disabled for the next continuous requests:

static <T> Supplier<T> sync(Supplier<T> target) {
    return sync(new ReentrantLock(), target);
}

static <T> Supplier<T> sync(ReentrantLock lock, Supplier<T> target) {
    //     v--- synchronizing for multi-threads once
    return lazy(() -> {
        // the interesting thing is that when more than one threads come in here
        // but target.delegate is only changed once
        lock.lock();
        try {  
            return target.get();
        } finally {
            lock.unlock();
        }
    });
}

the lazy method calls a given Supplier only once as below:

static <T> Supplier<T> lazy(T value, Supplier<T> defaults) {
    return lazy(() -> value != null ? value : defaults.get());
}

static <T> Supplier<T> lazy(Supplier<T> target) {
    return new Supplier<T>() {
        private volatile Supplier<T> delegate = () -> {
            T it = target.get();
            //v--- return the evaluated value in turn
            delegate = () -> it;
            return it;
        };

        @Override
        public T get() {
            return delegate.get();
        }
    };
}

IF you interested in how the final code was made, I have commit the code into github, you can just copy and use it. and you can found I have renamed lazy methods to once, which is more expressiveness.

like image 22
holi-java Avatar answered Oct 12 '22 23:10

holi-java