Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Stream API not working for lazy loaded collections in EclipseLink / Glassfish?

After detecting a flaw in one of my web services I tracked down the error to the following one-liner:

return this.getTemplate().getDomains().stream().anyMatch(domain -> domain.getName().equals(name));

This line was returning false when I positively knew that the list of domains contained a domain which name was equal to the provided name. So after scratching my head for a while, I ended up splitting the whole line to see what was going on. I got the following in my debugging session:

Screenshot of debugging session

Please notice the following line:

List<Domain> domains2 = domains.stream().collect(Collectors.toList());

According to the debugger, domains is a list with two elements. But after applying .stream().collect(Collectors.toList()) I get a completely empty list. Correct me if I'm wrong, but from what I understand, that should be the identity operation and return the same list (or a copy of it if we are strict). So what is going on here???

Before you ask: No, I haven't manipulated that screenshot at all.

To put this in context, this code is executed in a stateful request scoped EJB using JPA managed entities with field access in a extended persistence context. Here you have some parts of the code relevant to the problem at hand:

@Stateful
@RequestScoped
@Consumes(MediaType.APPLICATION_JSON)
@Produces(MediaType.APPLICATION_JSON)
public class DomainResources {
    @PersistenceContext(type = PersistenceContextType.EXTENDED) @RequestScoped
    private EntityManager entityManager;

    public boolean templateContainsDomainWithName(String name) { // Extra code included to diagnose the problem
        MetadataTemplate template = this.getTemplate();
        List<Domain> domains = template.getDomains();
        List<Domain> domains2 = domains.stream().collect(Collectors.toList());
        List<String> names = domains.stream().map(Domain::getName).collect(Collectors.toList());
        boolean exists1 = names.contains(name);
        boolean exists2 = this.getTemplate().getDomains().stream().anyMatch(domain -> domain.getName().equals(name));
        return this.getTemplate().getDomains().stream().anyMatch(domain -> domain.getName().equals(name));
    }

    @POST
    @RolesAllowed({"root"})
    public Response createDomain(@Valid @EmptyID DomainDTO domainDTO, @Context UriInfo uriInfo) {
        if (this.getTemplate().getLastVersionState() != State.DRAFT) {
            throw new UnmodifiableTemplateException();
        } else if (templateContainsDomainWithName(domainDTO.name)) {
            throw new DuplicatedKeyException("name", domainDTO.name);
        } else {
            Domain domain = this.getTemplate().createNewDomain(domainDTO.name);
            this.entityManager.flush();
            return Response.created(uriInfo.getAbsolutePathBuilder().path(domain.getId()).build()).entity(new DomainDTO(domain)).type(MediaType.APPLICATION_JSON).build();
        }
    }
}

@Entity
public class MetadataTemplate extends IdentifiedObject {
    @OneToMany(cascade = CascadeType.ALL, fetch = FetchType.EAGER, mappedBy = "metadataTemplate", orphanRemoval = true) @OrderBy(value = "creationDate")
    private List<Version> versions = new LinkedList<>();
    @OneToMany(cascade = CascadeType.ALL, fetch = FetchType.LAZY, orphanRemoval = true) @OrderBy(value = "name")
    private List<Domain> domains = new LinkedList<>();

    public List<Version> getVersions() {
        return Collections.unmodifiableList(versions);
    }

    public List<Domain> getDomains() {
        return Collections.unmodifiableList(domains);
    }
}

I've included both getVersions and getDomains methods because I have similar operations running flawlessly on versions. The only significant difference I'm able to find is that versions are eagerly fetched while domains are lazily fetched. But as far as I know the code is being executed inside a transaction and the list of domains is being loaded. If not I'd get a lazy initialization exception, wouldn't I?

UPDATE: Following @Ferrybig 's suggestion I've investigated the issue a bit further, and there doesn't seem to have anything to do with improper lazy loading. If I traverse the collection in a classic way I still can't get proper results using streams:

boolean found = false;
for (Domain domain: this.getTemplate().getDomains()) {
    if (domain.getName().equals(name)) {
        found = true;
    }
}

List<Domain> domains = this.getTemplate().getDomains();
long estimatedSize = domains.spliterator().estimateSize(); // This returns 0!
domains.spliterator().forEachRemaining(domain -> {
    // Execution flow never reaches this point!
});

So it seems that even when the collection has been loaded you still have that odd behavior. This seems to be a missing or empty spliterator implementation in the proxy used to manage lazy collections. What do you think?

BTW, this is deployed on Glassfish / EclipseLink

like image 839
José González Avatar asked Feb 12 '16 12:02

José González


2 Answers

The problem here comes from a combination of somebody else's mistakes in several places. The sum of all those mistakes provokes this buggy behavior.

First mistake: Dubious inheritance. EclipseLink seems to create a proxy to manage lazy collections of type org.eclipse.persistence.indirection.IndirectList. This class extends java.util.Vector although it overrides everything but removeRange. Why on earth, dear Eclipse developers, do you extend a class to override almost everything in the parent, instead of declaring that class to implement a suitable interface (Iterable<E>, Collection<E> or List<E>)?

Second mistake: Hey, I inherit from you but don't give a $#|T about your internals. So IndirectList does its magic of lazy loading things using a delegate. But, oh my! How do I compute size? Do I use (and maintain updated) the parent's elementCount property? No, of course, I just delegate that task to my delegate... so if the parent class needs to do anything related to size, well, bad luck. Anyway I've overrided everything... and they won't add anything new to that class, will they?

Third mistake: Encapsulation breakage. Enters Vector. In Java 1.8 this class is augmented and now provides a spliterator method to support the new stream functionalities. They create a static inner class (VectorSpliterator) that lets clients traverse the vector using the shiny new API. Everything ok until you notice that in order to know when to finish the traversal they use the protected instance variable elementCount instead of using the public API method size(). Because who would extend a non final class and return a size not based on elementCount? Do you see the disaster coming?

So here we are, IndirectList is unawarely inheriting new functionality from Vector (remember that it probably shouldn't inherit from it in first place), and breaking things with this combination of mistakes.

Summing up, it seems that stream traversal of lazy collections won't work even for already loaded collections when using EclipseLink (default JPA provider in Glassfish). Remember that these products come from the same vendor. Hooray!

WORKAROUND: In case you face this problem and still want to leverage the functional programming style provided by stream() you can make a copy of the collection so a proper iterator is built. In my case I was able to keep all similar uses of domains as one-liners modifying the getDomains method. I favor code readability (with functional style) over performance in this case:

public List<Domain> getDomains() {
    return Collections.unmodifiableList(new ArrayList<>(domains));
}

NOTE TO THE READER: Sorry for the sarcasm, but I hate to lose my precious development time with these things.

Thanks to @Ferrybig for the initial clue

UPDATE: Bug reported. If this has hit you, you can follow its progress at https://bugs.eclipse.org/bugs/show_bug.cgi?id=487799

like image 200
José González Avatar answered Oct 20 '22 18:10

José González


I've hit a very similar issue with this code in a unit test:

Optional<ChildTable> ct = st.getChildren().stream().filter(i -> i.getId().equals(20001000l)).findFirst();

ct.get() failed with a NoSuchElementException.

Updating EclipseLink from 2.5.2 to 2.6.2 solved this issue. You didn't mention the EclipseLink version.

I think your bug report is a duplicate of https://bugs.eclipse.org/bugs/show_bug.cgi?id=433075.

See also the unresolved bug with EclipseLink and Java 8 stream API https://bugs.eclipse.org/bugs/show_bug.cgi?id=467470.

like image 35
Christian Avatar answered Oct 20 '22 18:10

Christian