Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

JPARepository returns duplicated child entities [duplicate]

I'm developing the chat feature in my app. The problem occurs when I try to retrieve a Chat object from my database, all the messages are retrieved duplicated twice.

Project structure

This is the simplified structure followed:

UML diagram

And this are the entities (simplified):

@Entity
public class Chat {
    @Id
    @GeneratedValue
    private Long id;

    @ManyToMany
    @JoinTable(
            name = "user_chat",
            joinColumns = @JoinColumn(name = "chat"),
            inverseJoinColumns = @JoinColumn(name = "user")
    )
    private Set<User> users = new HashSet<>();

    @OneToMany(mappedBy = "chat", cascade = CascadeType.ALL)
    private List<Message> messages = new ArrayList<>();


    public void addMessage(Message message){
        this.messages.add(message);
    }
}


@Entity
public class Message {
    private @Id @GeneratedValue Long id;

    @ManyToOne
    @JoinColumn(name="senderId", nullable=false)
    private User sender;

    @Column(name="value")
    private String value;

    @ManyToOne()
    @JoinColumn(name = "chat")
    private Chat chat;
}


@Entity
public class User implements UserDetails {
    private @Id
    @GeneratedValue
    Long id;

    @OneToMany(mappedBy = "sender", cascade = CascadeType.ALL)
    private List<Message> messagesSent;

    @ManyToMany(mappedBy = "users")
    private Set<Chat> chats = new HashSet<>();
}

And this is the repository which is failing:

public interface ChatRepository extends JpaRepository<Chat,Long> {
    @Query("SELECT c FROM Chat c JOIN c.users u WHERE u.id IN :usersIDs GROUP BY c HAVING COUNT(DISTINCT u) = :numberOfUsers")
    Optional<Chat> getChat(@Param("usersIDs") List<Long> usersIDs, int numberOfUsers);
}

The problem

Here you can see an example of how the object is retrieved:

Debugger

As you can see, every Message object is duplicated

What I suppose the problem is

Researching and executing the raw SQL query to the DB, I come across with the assumption that the problem can be this: Hibernate is making the query this way: Join between Chat and Users and then Join between the result and Messages. This gives messages duplicated because it joins each message with Chat-User1 and Chat-User2 rows. Just this way:

SELECT message.id as messageID, chat.id as chatID, user.id as userID FROM chat JOIN user JOIN message;

SQL response

What I have tried until here

In response of the assumption I explained before, I tried changing the repository query this way:

    public interface ChatRepository extends JpaRepository<Chat,Long> {
        @Query("SELECT c FROM Chat c JOIN c.users u LEFT JOIN c.messages m ON m.sender = u AND m.chat = c WHERE u.id IN :usersIDs GROUP BY c HAVING COUNT(DISTINCT u) = :numberOfUsers")
        Optional<Chat> getChat(@Param("usersIDs") List<Long> usersIDs, int numberOfUsers);
    }

Query breakdown:

SELECT c FROM Chat c JOIN c.users u Chat and Users joined

LEFT JOIN c.messages m ON m.sender = u AND m.chat = c Join between the result and messages

WHERE u.id IN :usersIDs Filtering just requested users

GROUP BY c HAVING COUNT(DISTINCT u) = :numberOfUsers") Retrieving the Chat which has all the users requested

As you can see, the difference lies in the ON m.sender = u AND m.chat = c. This way, the Messages must also join the User, like in this query:

    SELECT message.id as messageID, chat.id as chatID, user.id as userID FROM chat JOIN user JOIN message ON message.sender_id = user.id and message.chat = chat.id;

SQL response

and the LEFT before the JOIN is for getting the users who hasn't send message to the chat but are members.

However, this query is not solving the problem. Returns the Chat with the messages duplicated as well.

This is a list of other things I have tried:

  • Adding @Fetch (FetchMode.SELECT) before private List<Message> messages = new ArrayList<>();
  • Using @Transactional before the query
  • Overriding equals(Object o) in Message

This is how the message.equals() method looks like:

@Override
public boolean equals(Object o) {
    if (this == o) return true;
    if (o == null || getClass() != o.getClass()) return false;
    Message message = (Message) o;
    return Objects.equals(id, message.id) && Objects.equals(sender, message.sender) && Objects.equals(value, message.value) && Objects.equals(chat, message.chat);
}

@Override
public int hashCode() {
    return Objects.hash(id, sender, value, chat);
}

Other SO posts which I have read

JPA - EntityManager find method returns duplicates

Multiple fetches with EAGER type in Hibernate with JPA

Updates

In response on what @EdmarSM have told me in the comments, I have modified my query like this:

@Query("SELECT DISTINCT c FROM Chat c INNER JOIN FETCH c.users u LEFT JOIN FETCH c.messages m WHERE u.id IN :usersIDs GROUP BY c HAVING COUNT(DISTINCT u) = :numberOfUsers")

However, I've received the next error:

Expression #3 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'close.m1_0.id' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by

I can see that this error is produced because of the INNER JOIN FETCH part. Without it, there is no error.

I also tried using the query without that part, just adding the DISTINCT to the query I've proposed above.

@Query("SELECT DISTINCT c FROM Chat c JOIN c.users u LEFT JOIN c.messages m ON m.sender = u AND m.chat = c WHERE u.id IN :usersIDs GROUP BY c HAVING COUNT(DISTINCT u) = :numberOfUsers")

But the response was exactly the same, the Chat with duplicated messages.

Second update

Again, according with @EdmarSM has told me, I tried with this:

public interface ChatRepository extends JpaRepository<Chat,Long> {
    @Query("SELECT DISTINCT c FROM Chat c INNER JOIN FETCH c.users u LEFT JOIN FETCH c.messages m WHERE c.id IN (SELECT c2.id FROM Chat c2 INNER JOIN c2.users u2 WHERE u2.id IN :usersIDs GROUP BY c2 HAVING COUNT(DISTINCT u2) = :numberOfUsers)")
    Optional<Chat> getChat(@Param("usersIDs") List<Long> usersIDs, int numberOfUsers);
}

I thought it has solved my problem but it hasn't. It just changed the way the messages were duplicated. Before, it used to be this way:

AABBCCDDEE...

And with this implementation, the pattern is ABCDEABCDE

In other words, it use to be:

  • Message 1
  • Message 1
  • Message 2
  • Message 2
  • Message 3
  • Message 3
  • Message 4
  • Message 4

And now is:

  • Message 1
  • Message 2
  • Message 3
  • Message 4
  • Message 1
  • Message 2
  • Message 3
  • Message 4
like image 270
Lucas Tomic Avatar asked Sep 15 '25 13:09

Lucas Tomic


1 Answers

When your query executes, your model will always return a cartesian plan composed of the 3 tables.

I believe that in this case what you should have to do is implement the DISTINCT clause into your projection, this way Hibernate would be able to compare the objects and add to the list of messages only one of each different message.

    public interface ChatRepository extends JpaRepository<Chat,Long> {
        @Query("SELECT DISTINCT c FROM Chat c INNER JOIN FETCH c.users u LEFT JOIN FETCH c.messages m WHERE c.id IN (SELECT c2.id FROM Chat c2 INNER JOIN c2.users u2 WHERE u2.id IN :usersIDs GROUP BY c2 HAVING COUNT(DISTINCT u2) = :numberOfUsers)")
        Optional<Chat> getChat(@Param("usersIDs") List<Long> usersIDs, int numberOfUsers);
    }

I have also added FETCH to your join to retrieve messages into a single shot, but that also can make the query bigger and add another instance of the user table to retrieve the message's user instead of reusing the same of the chat's user.

Besides distinct would have no effect on database query, Hibernate would know that the result must pass on a reduction process to eliminate duplicity.

Also, this article could be helpful: https://vladmihalcea.com/jpql-distinct-jpa-hibernate/

Notice that I've added the user count into a subquery, this way it is expected that you will not have an error in the group by clause.

like image 159
EdmarSM Avatar answered Sep 18 '25 02:09

EdmarSM