Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Duplicates using left join fetch

Tags:

I'm using a HQL query to obtain certain records. If I use LEFT JOIN FETCH a collection which is in my target entity will contain duplicate records. If I only use left join, it won't. I guess when Hibernate lazily loads the records, it avoids duplicates.

"SELECT z FROM ", TableA.TABLE_NAME, " z ",  //
            "LEFT JOIN FETCH z.attributeX pv ", //
            "LEFT JOIN FETCH pv.attributeY anteil ", //
            "LEFT JOIN FETCH z.attributeB kma ", //
            "LEFT JOIN FETCH kma.attributeC ", //
            "WHERE anteil.attributeD.attributeE.id = :eId ", //
            "AND pv.attributeG.id = :gId ");

My entity TableA has a link to to TablePV (LEFT JOIN FETCH z.attributeX pv)

TablePV has a collection of TableY (LEFT JOIN FETCH pv.attributeY anteil)

Now Hibernate will map everything correctly except for the children of TablePV. It will contain several times the same record. A distinct on TableA doesn't help, since there are no duplicates there. I could manually remove those records which would be quite nonperforming I guess.

like image 521
Daniel Avatar asked Mar 15 '12 13:03

Daniel


People also ask

What does left join fetch do?

JOIN FETCH (or LEFT JOIN FETCH ) will collect all the associations along with their owner object. Meaning that the collection will be retrieved in the same select. This can be shown by enabling Hibernate's statistics. A (left/outer) join fetch is great for *ToOne (many-to-one or one-to-one) associations.

How do I join two tables without duplicates?

The SQL UNION ALL operator is used to combine the result sets of 2 or more SELECT statements. It does not remove duplicate rows between the various SELECT statements (all rows are returned). Each SELECT statement within the UNION ALL must have the same number of fields in the result sets with similar data types.


1 Answers

The only way to really guarantee is using Set or SortedSet on these collections instead of use List. There is no other way officially to avoid this problem using Hibernate:

@OneToMany
private Set<AttributeY> attributeY;

You can read this tip in an old Hibernate documentation:

Queries that make use of eager fetching of collections usually return duplicates of the root objects, but with their collections initialized. You can filter these duplicates through a Set.

Or some kind of reference for the same problem on a newer one:

The only difference is that Set doesn’t allow duplicates, but this constraint is enforced by the Java object contract rather than the database mapping.

Set and order

If you would like to use Set and control the order of the entities, you can use SortedSet and implements Comparable on the child entities:

@OneToMany
@SortNatural
private SortedSet<AttributeY> attributeY = new TreeSet<>();

And:

@Entity
public class AttributeY implements Comparable<AttributeY> {

    @Override
    public int compareTo(AttributeY o) {
        return number.compareTo( o.getNumber() );
    }

}

To a custom sorting logic, you can use @SortComparator.

Precautions

Without more details it's hard to say why this happen in some cases using List and another cases don't. But you can try to implement equals/hashCode methods using the "business key" of the entity:

When using sets, it’s very important to supply proper equals/hashCode implementations for child entities. In the absence of a custom equals/hashCode implementation logic, Hibernate will use the default Java reference-based object equality which might render unexpected results when mixing detached and managed object instances.

Also, you are applying a condition using the FETCH alias pv and anteil. Don't do that. And get rid off the "train wreck" on your JPQL (anteil.attributeD.attributeE.id), because this can make Hibernate create weird SQLs (like doing the same JOIN more than once or invalid SQLs). So, make the JOINs explicit and not using the FETCH alias on WHERE:

LEFT JOIN FETCH z.attributeX
LEFT JOIN FETCH pv.attributeY
LEFT JOIN FETCH z.attributeB kma
LEFT JOIN FETCH kma.attributeC
LEFT JOIN pv.attributeY anteil
LEFT JOIN anteil.attributeD attributeD
LEFT JOIN attributeD.attributeE attributeE
LEFT JOIN z.attributeX pv
LEFT JOIN pv.attributeG attributeG
WHERE attributeE.id = :eId 
AND attributeG.id = :gId

If the duplication was in the root entity TableA, the DISTINCT will help, but it's not your case.

like image 187
Dherik Avatar answered Sep 19 '22 21:09

Dherik