I am currently developing a Spring Boot application that mainly pulls product review data from a message queue (~5 concurrent consumers) and stores it in a MySQL DB. Each review is uniquely identified by its reviewIdentifier (String), which is the primary key, and can belong to one or more products (e.g. products with different colors). Here is an excerpt of the data model:
public class ProductPlacement implements Serializable {
    private static final long serialVersionUID = 1L;

    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    @Column(name = "product_placement_id")
    private long id;

    @ManyToMany(fetch = FetchType.LAZY, cascade = CascadeType.ALL, mappedBy = "productPlacements")
    private Set<CustomerReview> customerReviews;
}

public class CustomerReview implements Serializable {
    private static final long serialVersionUID = 1L;

    @Id
    @Column(name = "customer_review_id")
    private String reviewIdentifier;

    @ManyToMany(fetch = FetchType.LAZY, cascade = CascadeType.ALL)
    @JoinTable(
        name = "tb_miner_review_to_product",
        joinColumns = @JoinColumn(name = "customer_review_id"),
        inverseJoinColumns = @JoinColumn(name = "product_placement_id")
    )
    private Set<ProductPlacement> productPlacements;
}
One message from the queue contains 1 to 15 reviews and a productPlacementId. Now I want an efficient method to persist the reviews for the product. There are basically two cases which need to be considered for each incoming review:
Currently my method for persisting the reviews is not optimal. It looks as follows (it uses Spring Data JpaRepositories):
@Override
@Transactional
public void saveAllReviews(List<CustomerReview> customerReviews, long productPlacementId) {
    ProductPlacement placement = productPlacementRepository.findOne(productPlacementId);
    for (CustomerReview review : customerReviews) {
        CustomerReview cr = customerReviewRepository.findOne(review.getReviewIdentifier());
        if (cr != null) {
            cr.getProductPlacements().add(placement);
            customerReviewRepository.saveAndFlush(cr);
        } else {
            Set<ProductPlacement> productPlacements = new HashSet<>();
            productPlacements.add(placement);
            review.setProductPlacements(productPlacements);
            cr = review;
            customerReviewRepository.saveAndFlush(cr);
        }
    }
}
Questions:
Update to question 1: Would a simple @Lock on my review repository prevent the unique-constraint exception?
@Lock(LockModeType.PESSIMISTIC_WRITE)
CustomerReview findByReviewIdentifier(String reviewIdentifier);
What happens when findByReviewIdentifier returns null? Can Hibernate lock the reviewIdentifier for a potential insert even if the method returns null?
Thank you!
From a performance point of view, I would consider evaluating the solution with the following changes.
I had the same question about which mapping results in more efficient DML statements. Quoting from "Typical ManyToMany mapping versus two OneToMany":
The option one might be simpler from a configuration perspective, but it yields less efficient DML statements.
Use the second option because whenever the associations are controlled by @ManyToOne associations, the DML statements are always the most efficient ones.
Enabling batching support would result in fewer round trips to the database to insert/update the same number of records.
Quoting from "batch INSERT and UPDATE statements":
hibernate.jdbc.batch_size = 50
hibernate.order_inserts = true
hibernate.order_updates = true
hibernate.jdbc.batch_versioned_data = true
The current code gets the ProductPlacement and, for each review, does a saveAndFlush, which results in no batching of DML statements.
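To make the cost of flushing per review concrete, here is a deliberately simplified model of the round trips needed for the INSERTs into a single table. It is only a sketch: the class and method names are made up for illustration, it ignores the join-table statements and the SELECTs, and it assumes the driver really sends each JDBC batch in one round trip.

```java
// Hypothetical helper, not part of Hibernate or Spring: a rough model of
// how many round trips the INSERTs into one table need.
class BatchRoundTrips {

    // saveAndFlush per entity: every flush sends its single INSERT on its own.
    static int flushPerEntity(int entityCount) {
        return entityCount;
    }

    // One flush at the end: INSERTs are grouped into JDBC batches of at most
    // batchSize statements (the value of hibernate.jdbc.batch_size).
    static int singleFlush(int entityCount, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        // Integer ceiling of entityCount / batchSize.
        return (entityCount + batchSize - 1) / batchSize;
    }

    public static void main(String[] args) {
        int reviewsPerMessage = 15; // upper bound mentioned in the question
        int batchSize = 50;         // hibernate.jdbc.batch_size from above
        System.out.println("flush per review: " + flushPerEntity(reviewsPerMessage));
        System.out.println("single flush:     " + singleFlush(reviewsPerMessage, batchSize));
    }
}
```

Under this model a message with 15 reviews goes from 15 round trips to 1 for that table, which is why the per-review saveAndFlush is worth measuring first.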
Instead, I would consider loading the ProductPlacement entity, adding the List<CustomerReview> customerReviews to the Set<CustomerReview> customerReviews field of the ProductPlacement entity, and finally calling the merge method once at the end, with these two changes:
Make the ProductPlacement entity the owner of the association, i.e., by moving the mappedBy attribute onto the Set<ProductPlacement> productPlacements field of the CustomerReview entity.
Have the CustomerReview entity implement the equals and hashCode methods using the reviewIdentifier field. I believe reviewIdentifier is unique and user assigned.
Finally, as you do performance tuning with these changes, baseline your performance with the current code first. Then make the changes and compare whether they really result in a significant performance improvement for your solution.
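For the equals/hashCode suggestion, a minimal sketch could look like the following. It is a simplified stand-in for the CustomerReview entity, not the asker's actual class: the JPA annotations and the productPlacements field are left out so it compiles on its own, and the EqualsHashCodeDemo class is purely illustrative. A real entity would additionally need a no-argument constructor.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Simplified stand-in for the CustomerReview entity (annotations omitted).
class CustomerReview {
    private final String reviewIdentifier;

    CustomerReview(String reviewIdentifier) {
        this.reviewIdentifier = Objects.requireNonNull(reviewIdentifier);
    }

    String getReviewIdentifier() {
        return reviewIdentifier;
    }

    // Two reviews are equal when they carry the same user-assigned identifier.
    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof CustomerReview)) return false;
        return reviewIdentifier.equals(((CustomerReview) other).reviewIdentifier);
    }

    // Must be consistent with equals, so it uses the same field.
    @Override
    public int hashCode() {
        return reviewIdentifier.hashCode();
    }
}

// Illustrative only: shows that a Set now treats duplicates as one element.
class EqualsHashCodeDemo {
    static int distinctCount(String... identifiers) {
        Set<CustomerReview> reviews = new HashSet<>();
        for (String id : identifiers) {
            reviews.add(new CustomerReview(id));
        }
        return reviews.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctCount("r-1", "r-2", "r-1"));
    }
}
```

With this in place, adding the same review twice to the Set<CustomerReview> of a ProductPlacement keeps a single element, which is what the "add everything, then merge once" approach relies on.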