Lazy, but persisted, evaluation of java 8 lambda

Tags:

java-8

I am currently working on creating my own persistent array in java, which uses a binary search tree to store an collection of values.

I want to add a map method that takes a Function as an argument to generate a new array. I don't want to evaluate the functions unless the particular value is requested. This is pretty simple to do as lambdas are lazy evaluated. However, I also only want a function to be evaluated only once, even if the result is requested multiple times.

I could create a node that stores the supplier and updates the result when evaluated:

class Node<T> {

    private T value;
    private Supplier<T> supplier;

    public T get() {
        if (null != value)
            return value;
        value = supplier.get();
        return value;
    }
}

...where supplier is derived from a Function being applied to a value in an older version of the persisted array.

However, this is no longer a functional approach, and has the potential to cause errors in a multi-threaded system*. It also doesn't yield an advantage in the case where supplier returns a null value**.

Another approach would be return an instance of Node on a get call:

class Node<T> {

    private final Optional<T> value;
    private final Supplier<T> supplier;

    Node(Supplier<T> supplier, T value) {
        this.supplier = supplier;
        this.value = Optional.ofNullable(value);
    }

    public Tuple<Node<T>, T> get() {
        if (null != value)
            return new Tuple<>(this, value.orElse(null));
        T result = supplier.get();
        Node<T> newNode = new Node<>(null, result);
        return new Tuple<>(newNode, result);
    }
}

I like this approach for staying functional; but it would require a lot of overhead in all the parent nodes going up the tree for just a get call. And it would require cumbersome unboxing in the using application's code.

Does anyone have another approach they can think of to make this work the way I'm asking? Thanks.

*This could be solved using locking mechanisms, but adds a layer of complexity I'm hoping to avoid.

**I've thought about making value an Optional<T>, where null means hasn't been evaluate, and Optional.empty() as has been evaluated and returns a null. However, this works around my issue, rather than solving it.

For anyone not familiar with a persisted array, it's a data structure that creates a new instance of itself whenever an update is performed. This allows it to be purely immutable. Using a binary tree (or more common 32-bit tree) method allows the updates to reduce duplicating data, both in speed and memory.

EDIT:

The code for the collection can be found on github. For description of use, you can look in the test folder.

776

asked Jun 12 '17 13:06

JRogerC

2 Answers

Disclaimer: this answer doesn't respond the question directly, as it doesn't use neither Supplier nor Optional directly in the Node class. Instead, a generic functional programming technique is presented that might help solve the problem.

If the problem is all about evaluating the function only once for each input value, then you shouldn't change your tree/array/nodes. Instead, memoize the function, which is a pure functional approach:

In computing, memoization or memoisation is an optimization technique used primarily to speed up computer programs by storing the results of expensive function calls and returning the cached result when the same inputs occur again

Here's a way to do it, inspired by this excellent article written by Pierre-Yves Saumont (please check it for an in-depth introduction to memoization):

public static <T, U> Function<T, U> memoize(Function<T, U> function) {
    Map<T, U> cache = new ConcurrentHashMap<>();
    return input -> cache.computeIfAbsent(input, function::apply);
}

Suppose you have a method that takes quite long to execute. Then, you could use the memoize method this way:

// This method takes quite long to execute
Integer longCalculation(Integer x) {
    try {
        Thread.sleep(1_000);
    } catch (InterruptedException ignored) {
    }
    return x * 2;
}

// Our function is a method reference to the method above
Function<Integer, Integer> function = this::longCalculation;

// Now we memoize the function declared above
Function<Integer, Integer> memoized = memoize(function);

Now, if you call:

int result1 = function.apply(1);
int result2 = function.apply(2);
int result3 = function.apply(3);
int result4 = function.apply(2);
int result5 = function.apply(1);

You'll notice that the five calls take ~5 seconds altogether (1 second for each call).

However, if you use the memoized function with the same input values 1 2 3 2 1:

int memoizedResult1 = memoized.apply(1);
int memoizedResult2 = memoized.apply(2);
int memoizedResult3 = memoized.apply(3);
int memoizedResult4 = memoized.apply(2); // <-- returned from cache
int memoizedResult5 = memoized.apply(1); // <-- returned from cache

You'll notice that now the five calls take ~3 seconds altogether. This is because the last two results are immediately returned from the cache.

So, back to your structure... Inside your map method, you could just memoize the given function and use the returned memoized function instead. Internally, this will cache the function's return values in the ConcurrentHashMap.

As the memoize method uses a ConcurrentHashMap internally, you don't need to worry about concurrency.

Note: This is just the beginning... I'm thinking about two possible improvements here. One would be to limit the size of the cache, so that it doesn't take the whole memory if the domain of the given function is too big. The other improvement would be to memoize the given function only if it hasn't been memoized previously. But these details are left as an exercise for the reader... ;)

191

answered Oct 12 '22 22:10

fps

I also only want a function to be evaluated only once, even if the result is requested multiple times.

How about this?

class Node<T> {
    private Supplier<T> supplier;

    Node(T value, Supplier<T> supplier) {
        this.supplier = sync(lazy(value, supplier));
    }

    public T get() {
        return supplier.get();
    }
}

the sync method only synchronizing the Supplier once, when the target is called, the lock is disabled for the next continuous requests:

static <T> Supplier<T> sync(Supplier<T> target) {
    return sync(new ReentrantLock(), target);
}

static <T> Supplier<T> sync(ReentrantLock lock, Supplier<T> target) {
    //     v--- synchronizing for multi-threads once
    return lazy(() -> {
        // the interesting thing is that when more than one threads come in here
        // but target.delegate is only changed once
        lock.lock();
        try {  
            return target.get();
        } finally {
            lock.unlock();
        }
    });
}

the lazy method calls a given Supplier only once as below:

static <T> Supplier<T> lazy(T value, Supplier<T> defaults) {
    return lazy(() -> value != null ? value : defaults.get());
}

static <T> Supplier<T> lazy(Supplier<T> target) {
    return new Supplier<T>() {
        private volatile Supplier<T> delegate = () -> {
            T it = target.get();
            //v--- return the evaluated value in turn
            delegate = () -> it;
            return it;
        };

        @Override
        public T get() {
            return delegate.get();
        }
    };
}

IF you interested in how the final code was made, I have commit the code into github, you can just copy and use it. and you can found I have renamed lazy methods to once, which is more expressiveness.

answered Oct 12 '22 23:10

holi-java

Related questions
                            
                                Avoid isPresent() and get() in control logic
                            
                                max_element in lambda function [closed]
                            
                                How to specify anonymous object as generic parameter?
                            
                                Return interface implementation with Kotlin and lambda
                            
                                Behavior with Kotlin Higher-Order Functions and Single-method interfaces?
                            
                                Apply pattern on file stream
                            
                                With C++ lambdas, what are the rules for capturing references by reference?
                            
                                Java 8 streams reduce and combine List items to Map
                            
                                Java 8 streams: count values
                            
                                convert list of object into a map of <String,Map<String,Integer>> java8 streams
                            
                                External argument to method reference in Java 8
                            
                                Type-only template argument to lambda
                            
                                Why Java's Optional doesn't call Consumer in ifPresent()?
                            
                                Lambda capture by value and the "mutable" keyword
                            
                                use if-else statement in java-8 lambda expression
                            
                                passing and enforcing a member function in java
                            
                                How to use DisposableObserver with lambda expressions in RxJava2
                            
                                NPoco: Update some (but not all) columns using Expression notation
                            
                                Java lambdas: Copy nodes from list to a new list
                            
                                How can I upload a 'file' to S3 by creating a temp file, using AWS Lambda?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With