
How to safely serialize a lambda?

Although it is possible to serialize a lambda in Java 8, it is strongly discouraged; even serializing inner classes is discouraged. The reason given is that lambdas may not deserialize properly on another JRE. However, doesn't this mean that there is a way to safely serialize a lambda?

For example, say I define a class to be something like this:

public class MyClass {
    private String value;
    private Predicate<String> validateValue;

    public MyClass(String value, Predicate<String> validate) {
        this.value = value;
        this.validateValue = validate;
    }

    public void setValue(String value) {
        if (!validateValue.test(value)) throw new IllegalArgumentException();
        this.value = value;
    }

    public void setValidation(Predicate<String> validate) {
        this.validateValue = validate;
    }
}

If I declared an instance of the class like this, I should not serialize it:

MyClass obj = new MyClass("some value", (s) -> !s.isEmpty());

But what if I made an instance of the class like this:

// Could even be a static nested class
public class IsNonEmpty implements Predicate<String>, Serializable {
    @Override
    public boolean test(String s) {
        return !s.isEmpty();
    }
}
MyClass isThisSafeToSerialize = new MyClass("some string", new IsNonEmpty());

Would this now be safe to serialize? My instinct says that yes, it should be safe, since there's no reason that interfaces in java.util.function should be treated any differently from any other random interface. But I'm still wary.

Justin asked Jun 24 '16 16:06



1 Answer

It depends on which kind of safety you want. It’s not the case that serialized lambdas cannot be shared between different JREs. They have a well-defined persistent representation, the SerializedLambda. When you study how it works, you’ll find that it relies on the presence of the defining class, which has a special method that reconstructs the lambda.
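To make the mechanism concrete, here is a minimal sketch of a lambda round trip within a single JVM. The class name LambdaRoundTrip and its helper methods are hypothetical, introduced only for illustration; the intersection cast is the standard way to ask the compiler for a serializable lambda.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.Predicate;

// Hypothetical demo class; the name and helpers are illustrative only.
final class LambdaRoundTrip {

    // The intersection cast makes the compiler emit a serializable lambda.
    static Predicate<String> nonEmpty() {
        return (Predicate<String> & Serializable) s -> !s.isEmpty();
    }

    // Writes the object to an in-memory byte array.
    static byte[] serialize(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.toByteArray();
    }

    // Reads the object back; for a lambda this goes through the
    // SerializedLambda form and the defining class (LambdaRoundTrip here).
    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        @SuppressWarnings("unchecked")
        Predicate<String> copy = (Predicate<String>) deserialize(serialize(nonEmpty()));
        System.out.println(copy.test("abc")); // true
        System.out.println(copy.test(""));    // false
    }
}
```

This only shows that the round trip works when the defining class is unchanged and present; it says nothing about compatibility across recompilation.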

What makes it unreliable is the dependency on compiler-specific artifacts, e.g. the synthetic target method, which has a generated name. Simple changes, like inserting another lambda expression or recompiling the class with a different compiler, can therefore break compatibility with existing serialized lambda expressions.
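The generated name can be made visible with a small sketch like the following. It assumes the runtime gives the lambda's class a private writeReplace method returning a SerializedLambda, which is how current OpenJDK builds behave but is an implementation detail, not a guarantee; the class name LambdaTargetName is hypothetical.

```java
import java.io.Serializable;
import java.lang.invoke.SerializedLambda;
import java.lang.reflect.Method;
import java.util.function.Predicate;

// Hypothetical demo class; the name and helpers are illustrative only.
final class LambdaTargetName {

    static Predicate<String> nonEmpty() {
        return (Predicate<String> & Serializable) s -> !s.isEmpty();
    }

    // Assumption: the lambda's runtime class declares a private
    // writeReplace() that returns the SerializedLambda form.
    static SerializedLambda describe(Object lambda) throws ReflectiveOperationException {
        Method writeReplace = lambda.getClass().getDeclaredMethod("writeReplace");
        writeReplace.setAccessible(true);
        return (SerializedLambda) writeReplace.invoke(lambda);
    }

    public static void main(String[] args) throws Exception {
        SerializedLambda form = describe(nonEmpty());
        // The defining class that must be present on deserialization.
        System.out.println(form.getCapturingClass());
        // The compiler-generated name of the synthetic target method.
        System.out.println(form.getImplMethodName());
    }
}
```

The second printed value is the part that is stored in the stream and that can change when the class is edited or recompiled.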

However, manually written classes aren’t immune to this either. Without an explicitly declared serialVersionUID, the default algorithm calculates an id by hashing class artifacts, including private and synthetic ones, which adds a similar compiler dependency. So the minimum to do, if you want a reliable persistent form, is to declare an explicit serialVersionUID.
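A minimal sketch of that minimum, assuming a static nested class in place of the top-level IsNonEmpty from the question. The names ExplicitUidDemo and NonEmptyCheck and the value 1L are arbitrary choices for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;
import java.util.function.Predicate;

// Hypothetical demo class; the names are illustrative only.
final class ExplicitUidDemo {

    static final class NonEmptyCheck implements Predicate<String>, Serializable {
        // Declared explicitly, so the id no longer depends on hashed class artifacts.
        private static final long serialVersionUID = 1L;

        @Override
        public boolean test(String s) {
            return !s.isEmpty();
        }
    }

    // Serializes to memory and reads the object straight back.
    static Object roundTrip(Object o) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        NonEmptyCheck copy = (NonEmptyCheck) roundTrip(new NonEmptyCheck());
        System.out.println(copy.test("abc")); // true
        // The descriptor reports the declared id rather than a computed hash.
        System.out.println(ObjectStreamClass.lookup(NonEmptyCheck.class).getSerialVersionUID()); // 1
    }
}
```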

Or you can turn to the most robust form possible:

public enum IsNonEmpty implements Predicate<String> {
    INSTANCE;

    @Override
    public boolean test(String s) {
        return !s.isEmpty();
    }
}

Serializing this constant does not store any properties of the actual implementation besides its class name (and the fact that it is an enum, of course) and the name of the constant. Upon deserialization, the actual unique instance with that name is used.
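The identity guarantee can be checked with a short sketch. The wrapper class EnumRoundTrip and its roundTrip helper are hypothetical; the nested enum repeats the one above only so that the example is self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.function.Predicate;

// Hypothetical demo class; the name and helper are illustrative only.
final class EnumRoundTrip {

    // Enums are serializable by default; no Serializable clause is needed.
    enum IsNonEmpty implements Predicate<String> {
        INSTANCE;

        @Override
        public boolean test(String s) {
            return !s.isEmpty();
        }
    }

    // Serializes to memory and reads the object straight back.
    static Object roundTrip(Object o) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Object copy = roundTrip(IsNonEmpty.INSTANCE);
        // Deserialization resolves the constant by name, so identity is preserved.
        System.out.println(copy == IsNonEmpty.INSTANCE); // true
    }
}
```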


Note that serializable lambda expressions may create security issues, because they open an alternative way of getting hold of an object that allows invoking the target methods. However, this applies to all serializable classes: every variant shown in your question and in this answer allows someone to deliberately deserialize an object that invokes the encapsulated operation. But with explicitly serializable classes, the author is usually more aware of this fact.

Holger answered Oct 20 '22 01:10
