I have to remove duplicates from an object list retrieved from a CrudRepository.
This is what I did:
if (!prospectionRepository.findAll().isEmpty()) {
    List<Prospection> all = prospectionRepository.findAll();
    for (int i = 0; i < all.size()-1; i++) {
        for (int k = i+1;k < all.size(); k++) {
            if (all.get(i).getProspectNumber() == all.get(k).getProspectNumber()) {
                all.remove(all.get(i));
            }
        }
    }
    prospectionRepository.save(all);
}
However, the duplicates are not removed from the list, and they are then persisted, which I do not want.
After a conversation in chat, additional constraints have to be taken into account:
- Multiple Prospect objects may have the same prospect number, but each has a unique primary key in the database. Consequently, the duplicate filter cannot rely on Prospect equality.
- A Prospect has a visited status, which defines whether the Prospect has been contacted by the company or not. The two main statuses are NEW and MET. Only one Prospect can be MET. Other duplicates (with the same prospect number) can only be NEW.
Solving the problem needs an additional step:
- Prospects are first grouped by prospect number. At this stage, we have a <ProspectNumber, List<Prospect>> mapping. However, each List<Prospect> must end up with a single element according to the rules defined earlier.
Consequently, the list will be generated with the following rules:
- If one of the duplicates is MET, the MET one is kept.
- If none is MET, then an arbitrary one is kept: Stream does not guarantee to loop in the list order.
The trick is to go through a Map, as the key enforces uniqueness. If your prospect number is a specific type, this assumes that equals() and hashCode() are properly defined.
disclaimer: code is untested
// MET is assumed to be statically imported from the status enum
List<Prospection> all = prospectionRepository.findAll().stream()
    // we instantiate here a Map<ProspectNumber, Prospect>
    // There is no need to have a Map<ProspectNumber, List<Prospect>>
    // as the merge function will do the selection for us
    .collect(Collectors.toMap(
        // Key: use the prospect number
        prospect -> prospect.getProspectNumber(),
        // Value: use the prospect object itself
        prospect -> prospect,
        // Merge function: two prospects with the same prospect number
        // are found: keep the one with the MET status or the first one
        (oldProspect, newProspect) -> {
            if (oldProspect.getStatus() == MET) {
                return oldProspect;
            } else if (newProspect.getStatus() == MET) {
                return newProspect;
            } else {
                // return the first one, arbitrary decision
                return oldProspect;
            }
        }
    ))
    // get map values only
    .values()
    // stream it in order to collect it as a List
    .stream()
    .collect(Collectors.toList());
prospectionRepository.save(all);
Map.values() actually returns a Collection. So if your prospectionRepository.save(...) accepts a Collection (not only a List), you can skip the final stream-and-collect step. I also use the following shorthands:
- Prospection::getProspectNumber is the Function equivalent to prospect -> prospect.getProspectNumber()
- Function.identity() is equivalent to prospect -> prospect
Collection<Prospection> all = prospectionRepository.findAll().stream()
    .collect(Collectors.toMap(
        Prospection::getProspectNumber,
        Function.identity(),
        (oldProspect, newProspect) -> newProspect.getStatus() == MET ? newProspect : oldProspect
    )).values();
prospectionRepository.save(all);
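To try the merge rule in isolation, here is a minimal self-contained sketch. The Prospect class, the Status enum and the sample data are hypothetical stand-ins for the real entity (they are not taken from the question), and the repository is replaced by a plain list.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class ProspectDedupDemo {

    // Hypothetical status enum, standing in for the real one
    enum Status { NEW, MET }

    // Hypothetical entity: unique id, possibly shared prospect number
    static final class Prospect {
        final long id;
        final String prospectNumber;
        final Status status;

        Prospect(long id, String prospectNumber, Status status) {
            this.id = id;
            this.prospectNumber = prospectNumber;
            this.status = status;
        }
    }

    // Keeps one prospect per prospect number, preferring the MET one
    static Collection<Prospect> dedupe(List<Prospect> all) {
        Map<String, Prospect> byNumber = all.stream()
            .collect(Collectors.toMap(
                p -> p.prospectNumber,
                Function.identity(),
                (oldP, newP) -> newP.status == Status.MET ? newP : oldP));
        return byNumber.values();
    }

    public static void main(String[] args) {
        List<Prospect> all = Arrays.asList(
            new Prospect(1, "A", Status.NEW),
            new Prospect(2, "A", Status.MET),
            new Prospect(3, "B", Status.NEW),
            new Prospect(4, "B", Status.NEW));
        for (Prospect p : dedupe(all)) {
            System.out.println(p.id + " " + p.prospectNumber + " " + p.status);
        }
    }
}
```

With this sample, the prospect number "A" resolves to the MET entry, while "B" resolves to one of its NEW entries. Which NEW entry survives should not be relied upon.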
For your information, if two Prospection objects with the same prospect number were equal (according to equals()), then a simple distinct() would have been enough:
List<Prospection> all = prospectionRepository.findAll()
    .stream()
    .distinct()
    .collect(Collectors.toList());
prospectionRepository.save(all);
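As an illustration of that condition, the sketch below uses a hypothetical Prospection class whose equals() and hashCode() look only at the prospect number. This is an assumption made for the example; whether the real entity should define equality this way is a separate design decision, and this variant does not apply the MET rule.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class DistinctDemo {

    // Hypothetical entity: equality is based on the prospect number only
    static final class Prospection {
        final long id;
        final String prospectNumber;

        Prospection(long id, String prospectNumber) {
            this.id = id;
            this.prospectNumber = prospectNumber;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (!(other instanceof Prospection)) return false;
            return prospectNumber.equals(((Prospection) other).prospectNumber);
        }

        @Override
        public int hashCode() {
            // must be consistent with equals(): same number, same hash
            return prospectNumber.hashCode();
        }
    }

    // distinct() relies on equals()/hashCode() to drop duplicates
    static List<Prospection> dedupe(List<Prospection> all) {
        return all.stream().distinct().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Prospection> all = Arrays.asList(
            new Prospection(1, "A"),
            new Prospection(2, "A"),
            new Prospection(3, "B"));
        for (Prospection p : dedupe(all)) {
            System.out.println(p.id + " " + p.prospectNumber);
        }
    }
}
```

For an ordered stream such as one created from a List, distinct() keeps the first occurrence of each group of equal elements.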
You can use something like this:
List<Prospection> all = prospectionRepository.findAll();
Set<Object> prospectNumbers = new HashSet<Object>();
Iterator<Prospection> it = all.iterator();
while (it.hasNext()) {
    Prospection item = it.next();
    Object itemNumber = item.getProspectNumber();
    if (prospectNumbers.contains(itemNumber)) {
        // this prospect number was already seen: drop the duplicate
        it.remove();
    } else {
        prospectNumbers.add(itemNumber);
    }
}
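On Java 8 or later, the same idea can probably be written more compactly with Collection.removeIf, since Set.add returns false when the element is already present. The sketch below is a minimal, self-contained version; the Prospection class and the sample data are hypothetical stand-ins, and the list must be mutable (here an ArrayList).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class RemoveIfDemo {

    // Hypothetical entity, standing in for the real Prospection
    static final class Prospection {
        final long id;
        final String prospectNumber;

        Prospection(long id, String prospectNumber) {
            this.id = id;
            this.prospectNumber = prospectNumber;
        }

        String getProspectNumber() {
            return prospectNumber;
        }
    }

    // Removes, in place, every item whose prospect number was already seen
    static void removeDuplicates(List<Prospection> all) {
        Set<Object> seen = new HashSet<>();
        // seen.add(...) returns false for a number that is already in the set
        all.removeIf(item -> !seen.add(item.getProspectNumber()));
    }

    public static void main(String[] args) {
        List<Prospection> all = new ArrayList<>(Arrays.asList(
            new Prospection(1, "A"),
            new Prospection(2, "B"),
            new Prospection(3, "A")));
        removeDuplicates(all);
        for (Prospection p : all) {
            System.out.println(p.id + " " + p.getProspectNumber());
        }
    }
}
```

Like the iterator loop, this keeps the first item for each prospect number and ignores the status, so it does not give priority to a MET prospect.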