
Getting duplicate items when querying a collection with Spring Data Rest

I'm getting duplicate results from a collection with this simple model: a Module entity and a Page entity. A Module has a set of pages, and each Page belongs to one module.

This is set up with Spring Boot with Spring Data JPA and Spring Data Rest.

The full code is accessible on GitHub

Entities

Here's the code for the entities. Most setters removed for brevity:

Module.java

@Entity
@Table(name = "dt_module")
public class Module {
  private Long id;
  private String label;
  private String displayName;
  private Set<Page> pages;

  @Id
  public Long getId() {
    return id;
  }

  public String getLabel() {
    return label;
  }

  public String getDisplayName() {
    return displayName;
  }

  @OneToMany(mappedBy = "module")
  public Set<Page> getPages() {
    return pages;
  }

  public void addPage(Page page) {
    if (pages == null) {
      pages = new HashSet<>();
    }
    pages.add(page);
    if (page.getModule() != this) {
      page.setModule(this);
    }
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) return true;
    if (o == null || getClass() != o.getClass()) return false;
    Module module = (Module) o;
    return Objects.equals(label, module.label) && Objects.equals(displayName, module.displayName);
  }

  @Override
  public int hashCode() {
    return Objects.hash(label, displayName);
  }
}

Page.java

@Entity
@Table(name = "dt_page")
public class Page {
  private Long id;
  private String name;
  private String action;
  private String description;
  private Module module;

  @Id
  public Long getId() {
    return id;
  }

  public String getName() {
    return name;
  }

  public String getAction() {
    return action;
  }

  public String getDescription() {
    return description;
  }

  @ManyToOne
  public Module getModule() {
    return module;
  }

  public void setModule(Module module) {
    this.module = module;
    this.module.addPage(this);
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) return true;
    if (o == null || getClass() != o.getClass()) return false;
    Page page = (Page) o;
    return Objects.equals(name, page.name) &&
        Objects.equals(action, page.action) &&
        Objects.equals(description, page.description) &&
        Objects.equals(module, page.module);
  }

  @Override
  public int hashCode() {
    return Objects.hash(name, action, description, module);
  }
}

Repositories

Now the code for the Spring repositories, which is fairly simple:

ModuleRepository.java

@RepositoryRestResource(collectionResourceRel = "module", path = "module")
public interface ModuleRepository extends PagingAndSortingRepository<Module, Long> {
}

PageRepository.java

@RepositoryRestResource(collectionResourceRel = "page", path = "page")
public interface PageRepository extends PagingAndSortingRepository<Page, Long> {
}

Config

The configuration comes from 2 files:

Application.java

@EnableJpaRepositories
@SpringBootApplication
public class Application {
  public static void main(String[] args) {
    SpringApplication.run(Application.class, args);
  }
}

application.properties

spring.jpa.database = H2

spring.jpa.database-platform=org.hibernate.dialect.H2Dialect
spring.jpa.generate-ddl=false
spring.jpa.hibernate.ddl-auto=validate

spring.datasource.initialize=true
spring.datasource.url=jdbc:h2:mem:demo;DB_CLOSE_DELAY=-1;DB_CLOSE_ON_EXIT=FALSE
spring.datasource.driverClassName=org.h2.Driver
spring.datasource.username=sa
spring.datasource.password=

spring.data.rest.basePath=/api

Database

Finally the db schema and some test data:

schema.sql

drop table if exists dt_page;
drop table if exists dt_module;

create table DT_MODULE (
  id IDENTITY primary key,
  label varchar(30) not NULL,
  display_name varchar(40) not NULL
);

create table DT_PAGE (
  id IDENTITY primary key,
  name varchar(50) not null,
  action varchar(50) not null,
  description varchar(255),
  module_id bigint not null REFERENCES dt_module(id)
);

data.sql

INSERT INTO DT_MODULE (label, display_name) VALUES ('mod1', 'Module 1'), ('mod2', 'Module 2'), ('mod3', 'Module 3');
INSERT INTO DT_PAGE (name, action, description, module_id) VALUES ('page1', 'action1', 'desc1', 1);

That's about it. Now, I run this from the command line to start the application: mvn spring-boot:run. After the application starts, I can query its main endpoint like this:

Get API
$ curl http://localhost:8080/api
Response
{
  "_links" : {
    "page" : {
      "href" : "http://localhost:8080/api/page{?page,size,sort}",
      "templated" : true
    },
    "module" : {
      "href" : "http://localhost:8080/api/module{?page,size,sort}",
      "templated" : true
    },
    "profile" : {
      "href" : "http://localhost:8080/api/alps"
    }
  }
}
Get all modules
curl http://localhost:8080/api/module
Response
{
  "_links" : {
    "self" : {
      "href" : "http://localhost:8080/api/module"
    }
  },
  "_embedded" : {
    "module" : [ {
      "label" : "mod1",
      "displayName" : "Module 1",
      "_links" : {
        "self" : {
          "href" : "http://localhost:8080/api/module/1"
        },
        "pages" : {
          "href" : "http://localhost:8080/api/module/1/pages"
        }
      }
    }, {
      "label" : "mod2",
      "displayName" : "Module 2",
      "_links" : {
        "self" : {
          "href" : "http://localhost:8080/api/module/2"
        },
        "pages" : {
          "href" : "http://localhost:8080/api/module/2/pages"
        }
      }
    }, {
      "label" : "mod3",
      "displayName" : "Module 3",
      "_links" : {
        "self" : {
          "href" : "http://localhost:8080/api/module/3"
        },
        "pages" : {
          "href" : "http://localhost:8080/api/module/3/pages"
        }
      }
    } ]
  },
  "page" : {
    "size" : 20,
    "totalElements" : 3,
    "totalPages" : 1,
    "number" : 0
  }
}
Get all pages for one module
curl http://localhost:8080/api/module/1/pages
Response
{
  "_links" : {
    "self" : {
      "href" : "http://localhost:8080/api/module/1/pages"
    }
  },
  "_embedded" : {
    "page" : [ {
      "name" : "page1",
      "action" : "action1",
      "description" : "desc1",
      "_links" : {
        "self" : {
          "href" : "http://localhost:8080/api/page/1"
        },
        "module" : {
          "href" : "http://localhost:8080/api/page/1/module"
        }
      }
    }, {
      "name" : "page1",
      "action" : "action1",
      "description" : "desc1",
      "_links" : {
        "self" : {
          "href" : "http://localhost:8080/api/page/1"
        },
        "module" : {
          "href" : "http://localhost:8080/api/page/1/module"
        }
      }
    } ]
  }
}

So as you can see, I'm getting the same page twice here. What's going on?

Bonus question: Why does this work?

I was cleaning the code to submit this question, and in order to make it more compact, I moved the JPA Annotations on the Page entity to field level, like this:

Page.java

@Entity
@Table(name = "dt_page")
public class Page {
  @Id
  private Long id;
  private String name;
  private String action;
  private String description;

  @ManyToOne
  private Module module;
  ...

The rest of the class remains the same. This can be seen in the same GitHub repo on the branch field-level.

As it turns out, with that change in place (and after starting the server the same way as before), the same request to the API returns the expected result:

Get all pages for one module
curl http://localhost:8080/api/module/1/pages
Response
{
  "_links" : {
    "self" : {
      "href" : "http://localhost:8080/api/module/1/pages"
    }
  },
  "_embedded" : {
    "page" : [ {
      "name" : "page1",
      "action" : "action1",
      "description" : "desc1",
      "_links" : {
        "self" : {
          "href" : "http://localhost:8080/api/page/1"
        },
        "module" : {
          "href" : "http://localhost:8080/api/page/1/module"
        }
      }
    } ]
  }
}
asked by alejo, Jul 02 '15 05:07



1 Answer

This setter in the Page entity is causing your issue:

  public void setModule(Module module) {
    this.module = module;
    this.module.addPage(this); //this line right here
  }

Because you put the JPA annotations on the getters, Hibernate uses your setters to initialize the entity.

Initialization sequence that causes the issue:

  1. Module object created
  2. Set Module properties (pages set is initialized)
  3. Page object created
  4. Add the created Page to Module.pages
  5. Set Page properties
  6. setModule is called on the Page object, and its call to addPage adds the current Page to Module.pages a second time
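
The sequence above can be replayed in plain Java to see why a Set ends up holding the same Page twice. Page.hashCode() depends on fields that are still null at step 4, so once step 5 fills them in, the second add at step 6 no longer matches the entry stored earlier. This is only a sketch, not Hibernate's actual loading code: the class DuplicatePageDemo, the replay() method and the trimmed-down Module and Page are made up for illustration.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class DuplicatePageDemo {

    // Trimmed-down stand-in for the Module entity (illustration only).
    static class Module {
        final Set<Page> pages = new HashSet<>();

        void addPage(Page page) {
            pages.add(page);
            if (page.getModule() != this) {
                page.setModule(this);
            }
        }
    }

    // Trimmed-down stand-in for the Page entity (illustration only).
    static class Page {
        private String name;
        private Module module;

        Module getModule() { return module; }

        void setName(String name) { this.name = name; }

        // Same shape as the setter in the question: it also registers the page.
        void setModule(Module module) {
            this.module = module;
            this.module.addPage(this);
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Page)) return false;
            return Objects.equals(name, ((Page) o).name);
        }

        // Hash depends on a mutable field, as in the question.
        @Override
        public int hashCode() { return Objects.hash(name); }
    }

    // Replays steps 4 to 6 by hand (assumed order, taken from the list above).
    static int replay() {
        Module module = new Module();
        Page page = new Page();
        module.pages.add(page);   // step 4: page stored while name is still null
        page.setName("page1");    // step 5: hashCode of the page changes
        page.setModule(module);   // step 6: addPage stores the same object again
        return module.pages.size();
    }

    public static void main(String[] args) {
        System.out.println(replay()); // prints 2
    }
}
```

The set contains one object under two different hash values, which is why the REST response lists page 1 twice.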

As for the bonus question: if you put the JPA annotations on the fields, it works because the setters are not called during initialization.
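
Here is a rough sketch of what "setters won't be called" means. It uses java.lang.reflect as a stand-in for field access; how Hibernate actually writes fields may differ. FieldAccessDemo, setterCallsWithFieldAccess() and the setterCalls counter are hypothetical names used only for this illustration.

```java
import java.lang.reflect.Field;
import java.util.HashSet;
import java.util.Set;

public class FieldAccessDemo {

    // Trimmed-down stand-in for the Module entity (illustration only).
    static class Module {
        final Set<Page> pages = new HashSet<>();
    }

    // Trimmed-down stand-in for the Page entity (illustration only).
    static class Page {
        private Module module;
        int setterCalls = 0; // hypothetical counter, only here to observe the setter

        // Setter with the same side effect as in the question.
        void setModule(Module module) {
            setterCalls++;
            this.module = module;
            module.pages.add(this);
        }
    }

    // Populates Page.module the way a field-access loader could: directly on the field.
    static int setterCallsWithFieldAccess() {
        Module module = new Module();
        Page page = new Page();
        module.pages.add(page); // the loader has already put the page in the collection
        try {
            Field field = Page.class.getDeclaredField("module");
            field.setAccessible(true);
            field.set(page, module); // writes the field, bypassing setModule
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
        return page.setterCalls;
    }

    public static void main(String[] args) {
        System.out.println(setterCallsWithFieldAccess()); // prints 0
    }
}
```

Since setModule never runs, its addPage side effect never fires and the page is not added to the collection a second time.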

answered by Cyril, Oct 05 '22 12:10