Compare Java elements from a findAll()

I have to remove duplicates from an object list retrieved from a CrudRepository.

I did this:

if (!prospectionRepository.findAll().isEmpty()) {
    List<Prospection> all = prospectionRepository.findAll();
    for (int i = 0; i < all.size()-1; i++) {
        for (int k = i+1;k < all.size(); k++) {
            if (all.get(i).getProspectNumber() == all.get(k).getProspectNumber()) {
                all.remove(all.get(i));
            }
        }
    }
    prospectionRepository.save(all);
}

However, the duplicates are not removed from the list, so they are persisted even though I do not want them to be.

Fosfor asked Apr 08 '26 12:04


2 Answers

Problem edit

After a conversation in chat, some additional constraints have to be taken into account:

  1. Multiple Prospects may share the same prospect number, but each has a unique primary key in the database. Consequently, the duplicate filter cannot rely on Prospect equality.
  2. A Prospect has a visited status that records whether the Prospect has been contacted by the company. The two main statuses are NEW and MET. Among duplicates (same prospect number), at most one Prospect can be MET; the others can only be NEW.

Algorithm

The problem needs an additional step to be solved:

  1. The prospects need to be grouped by prospect number. At this stage, we have a <ProspectNumber, List<Prospect>> mapping. However, each List<Prospect> must end up with a single element according to the rules defined earlier.
  2. Within a list, if a prospect has not been met AND another prospect with a met status is found, then the first prospect is discarded.

Consequently, the list will be generated with the following rules:

  • If a prospect has no duplicate in terms of prospect number, it is kept regardless of its status.
  • If a prospect has duplicates in terms of prospect number, only the met one is kept.
  • If multiple prospects have the same prospect number but none is met, an arbitrary one is kept: the code below keeps whichever one the stream encounters first.
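
To make the two steps concrete, here is a minimal sketch that performs the grouping explicitly and then picks one element per group. The Prospection class, its fields and the Status enum are hypothetical stand-ins for the real entity, and a String is assumed for the prospect number.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class GroupingSketch {

    enum Status { NEW, MET }

    // Hypothetical stand-in for the real entity: only the fields the rules need.
    static final class Prospection {
        final long id;               // unique primary key
        final String prospectNumber; // may be shared by several rows
        final Status status;

        Prospection(long id, String prospectNumber, Status status) {
            this.id = id;
            this.prospectNumber = prospectNumber;
            this.status = status;
        }
    }

    static List<Prospection> keepOnePerNumber(List<Prospection> all) {
        // Step 1: group by prospect number (LinkedHashMap keeps first-seen order).
        Map<String, List<Prospection>> byNumber = all.stream()
                .collect(Collectors.groupingBy(
                        p -> p.prospectNumber,
                        LinkedHashMap::new,
                        Collectors.toList()));

        // Step 2: per group, keep the MET prospect if any, else the first one.
        List<Prospection> kept = new ArrayList<>();
        for (List<Prospection> group : byNumber.values()) {
            Prospection chosen = group.stream()
                    .filter(p -> p.status == Status.MET)
                    .findFirst()
                    .orElse(group.get(0));
            kept.add(chosen);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Prospection> all = Arrays.asList(
                new Prospection(1, "A", Status.NEW),
                new Prospection(2, "A", Status.MET),
                new Prospection(3, "B", Status.NEW),
                new Prospection(4, "C", Status.NEW),
                new Prospection(5, "C", Status.NEW));
        for (Prospection p : keepOnePerNumber(all)) {
            System.out.println(p.id + " " + p.prospectNumber + " " + p.status);
        }
    }
}
```

With this sample data, the sketch keeps ids 2 (the MET duplicate of "A"), 3 (no duplicate) and 4 (first of two NEW duplicates of "C").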

Code

The trick is to go through a Map, as its keys enforce uniqueness. If your prospect number is a custom type, this assumes that equals() and hashCode() are properly defined for it.

disclaimer: code is untested

List<Prospection> all = prospectionRepository.findAll().stream()
        // we build a Map<ProspectNumber, Prospection> here
        // There is no need for a Map<ProspectNumber, List<Prospection>>
        // as the merge function does the selection for us
        .collect(Collectors.toMap(
                // Key: use the prospect number
                prospect -> prospect.getProspectNumber(),
                // Value: use the prospect object itself
                prospect -> prospect,
                // Merge function: two prospects with the same prospect number
                // are found: keep the one with the MET status, or else the first one
                // (MET is assumed to be a statically imported status constant)
                (oldProspect, newProspect) -> {
                    if (oldProspect.getStatus() == MET) {
                        return oldProspect;
                    } else if (newProspect.getStatus() == MET) {
                        return newProspect;
                    } else {
                        // return the first one, arbitrary decision
                        return oldProspect;
                    }
                }
        ))
        // get map values only
        .values()
        // stream it in order to collect it as a List
        .stream()
        .collect(Collectors.toList());
prospectionRepository.save(all);
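
One thing to be aware of: the three-argument Collectors.toMap makes no promise about the Map type it returns, so the order of values() may differ from the order of the original list. If that order matters, a possible variant is to pass a map supplier such as LinkedHashMap::new as a fourth argument. The sketch below is self-contained; the Prospection class, its fields and the Status enum are hypothetical stand-ins for the real entity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class ToMapSketch {

    enum Status { NEW, MET }

    // Hypothetical stand-in for the real entity.
    static final class Prospection {
        final long id;
        final String prospectNumber;
        final Status status;

        Prospection(long id, String prospectNumber, Status status) {
            this.id = id;
            this.prospectNumber = prospectNumber;
            this.status = status;
        }
    }

    static List<Prospection> dedupe(List<Prospection> all) {
        Map<String, Prospection> byNumber = all.stream()
                .collect(Collectors.toMap(
                        p -> p.prospectNumber,
                        p -> p,
                        // keep the MET prospect if the newcomer is MET, else the earlier one
                        (oldP, newP) -> newP.status == Status.MET ? newP : oldP,
                        // the supplier fixes the map type: keys stay in first-seen order
                        LinkedHashMap::new));
        return new ArrayList<>(byNumber.values());
    }

    public static void main(String[] args) {
        List<Prospection> all = Arrays.asList(
                new Prospection(1, "A", Status.NEW),
                new Prospection(2, "B", Status.NEW),
                new Prospection(3, "A", Status.MET),
                new Prospection(4, "B", Status.NEW));
        for (Prospection p : dedupe(all)) {
            System.out.println(p.id + " " + p.prospectNumber + " " + p.status);
        }
    }
}
```

With this sample data, the result holds id 3 for "A" (the MET one) followed by id 2 for "B" (the first NEW one).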

Map.values() actually returns a Collection. So if your prospectionRepository.save(...) accepts a Collection (not only a List), you can skip the final stream().collect(...) step. The version below also uses the following shorthands:

  • Method reference: Prospection::getProspectNumber is the Function equivalent of prospect -> prospect.getProspectNumber()
  • Function.identity() is equivalent to prospect -> prospect
  • Ternary operator: it returns the same result as the if/else chain, given that at most one duplicate can be MET

Collection<Prospection> all = prospectionRepository.findAll().stream()
        .collect(Collectors.toMap(
                Prospection::getProspectNumber,
                Function.identity(),
                (oldProspect, newProspect) -> newProspect.getStatus() == MET ? newProspect : oldProspect
        )).values();
prospectionRepository.save(all);

For your information, if two Prospection objects with the same prospect number were equal (according to equals()), a simple distinct() would have been enough:

List<Prospection> all = prospectionRepository.findAll()
        .stream()
        .distinct()
        .collect(Collectors.toList());
prospectionRepository.save(all);
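
For distinct() to behave that way, equals() and hashCode() would have to be based on the prospect number alone. Here is a minimal sketch of what that could look like; the Prospection class and its fields are hypothetical, and this does not apply to the case above, where duplicates carry different primary keys and statuses.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class DistinctSketch {

    // Hypothetical entity whose equality is defined by the prospect number only.
    static final class Prospection {
        final String prospectNumber;
        final String name;

        Prospection(String prospectNumber, String name) {
            this.prospectNumber = prospectNumber;
            this.name = name;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Prospection)) return false;
            return prospectNumber.equals(((Prospection) o).prospectNumber);
        }

        @Override
        public int hashCode() {
            // must be consistent with equals(): same number, same hash
            return prospectNumber.hashCode();
        }
    }

    static List<Prospection> dedupe(List<Prospection> all) {
        // on an ordered stream, distinct() keeps the first of each group of equal elements
        return all.stream().distinct().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Prospection> all = Arrays.asList(
                new Prospection("A", "first"),
                new Prospection("B", "only"),
                new Prospection("A", "second"));
        for (Prospection p : dedupe(all)) {
            System.out.println(p.prospectNumber + " " + p.name);
        }
    }
}
```

With this sample data, "A first" and "B only" remain, and "A second" is dropped as equal to the first element.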

Al-un answered Apr 10 '26 02:04


You can use something like:

List<Prospection> all = prospectionRepository.findAll();
// prospect numbers seen so far
Set<Object> prospectNumbers = new HashSet<Object>();
Iterator<Prospection> it = all.iterator();
while (it.hasNext()) {
    Prospection item = it.next();
    Object itemNumber = item.getProspectNumber();
    if (prospectNumbers.contains(itemNumber)) {
        // duplicate number: remove through the iterator, which is safe while iterating
        it.remove();
    } else {
        prospectNumbers.add(itemNumber);
    }
}
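
The same idea can be written more compactly, since Set.add(...) returns false when the value is already present. Below is a sketch under a few assumptions: the Prospection class is a hypothetical stand-in, the list is a mutable ArrayList, and the predicate is applied in list order (which ArrayList.removeIf does in practice).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class SeenSetSketch {

    // Hypothetical stand-in for the real entity.
    static final class Prospection {
        final long id;
        final String prospectNumber;

        Prospection(long id, String prospectNumber) {
            this.id = id;
            this.prospectNumber = prospectNumber;
        }
    }

    // Removes, in place, every element whose prospect number was already seen.
    static void removeDuplicateNumbers(List<Prospection> all) {
        Set<String> seen = new HashSet<>();
        // add() returns false for a number that is already in the set
        all.removeIf(p -> !seen.add(p.prospectNumber));
    }

    public static void main(String[] args) {
        // wrap in ArrayList: Arrays.asList alone is fixed-size and cannot remove
        List<Prospection> all = new ArrayList<>(Arrays.asList(
                new Prospection(1, "A"),
                new Prospection(2, "B"),
                new Prospection(3, "A"),
                new Prospection(4, "B"),
                new Prospection(5, "C")));
        removeDuplicateNumbers(all);
        for (Prospection p : all) {
            System.out.println(p.id + " " + p.prospectNumber);
        }
    }
}
```

With this sample data, ids 1, 2 and 5 remain. Like the loop above, this keeps the first occurrence of each number and does not look at the status.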

Aleh Maksimovich answered Apr 10 '26 01:04


