Remove duplicates from List<Object> based on condition

Starting point:

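public class Employee {
    private String id;
    private String name;
    private int age; // numeric, so ages compare by value rather than as text

    // constructor and getters (getId, getName, getAge) omitted
}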

I have a list of Employee: List<Employee> employee;

Employee examples from the list:

{id="1", name="John", age=10}
{id="2", name="Ana", age=12}
{id="3", name="John", age=23}
{id="4", name="John", age=14}

Let's assume that age is unique.

How can I remove all duplicates from the list based on the name property, keeping only the entry with the largest age for each name?

The output should look like:

{id="2", name="Ana", age=12}
{id="3", name="John", age=23}

The way I tried:

HashSet<Object> temp = new HashSet<>();
employee.removeIf(e->!temp.add(e.getName()));

...but this way the first match for each name is kept in employee:

{id="1", name="John", age=10}
{id="2", name="Ana", age=12}

...and I have no idea how to add another condition here so that the entry with the largest age is kept instead.

LeeWay asked Oct 19 '18 08:10


2 Answers

Here's a way that groups elements by name and reduces groups by selecting the one with max age:

List<Employee> uniqueEmployees = employees.stream()
        .collect(Collectors.groupingBy(Employee::getName,
                Collectors.maxBy(Comparator.comparing(Employee::getAge))))
        .values()
        .stream()
        .map(Optional::get)
        .collect(Collectors.toList());

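This returns [[id=2, name=Ana, age=12], [id=3, name=John, age=23]] with your test data. The order of the entries is not guaranteed, because groupingBy makes no promise about the kind of map it returns.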
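
For anyone who wants to run this, here is a minimal self-contained sketch of the same groupingBy/maxBy approach with the question's sample data. The Employee class below (int age, constructor, toString) and the method name keepOldestPerName are assumptions made for the example, not part of the original code. Because the map order is unspecified, the sketch sorts the result by name to get a stable output.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

class GroupingByMaxAge {
    // Assumed shape of Employee: age is an int so it compares numerically.
    static class Employee {
        final String id;
        final String name;
        final int age;

        Employee(String id, String name, int age) {
            this.id = id;
            this.name = name;
            this.age = age;
        }

        String getName() { return name; }
        int getAge() { return age; }

        @Override
        public String toString() {
            return "[id=" + id + ", name=" + name + ", age=" + age + "]";
        }
    }

    // Hypothetical helper: one employee per name, the one with the largest age.
    static List<Employee> keepOldestPerName(List<Employee> employees) {
        return employees.stream()
                .collect(Collectors.groupingBy(Employee::getName,
                        Collectors.maxBy(Comparator.comparingInt(Employee::getAge))))
                .values()
                .stream()
                .map(Optional::get) // safe: every group holds at least one element
                .sorted(Comparator.comparing(Employee::getName)) // stable order for printing
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Employee> employees = Arrays.asList(
                new Employee("1", "John", 10),
                new Employee("2", "Ana", 12),
                new Employee("3", "John", 23),
                new Employee("4", "John", 14));
        // prints [[id=2, name=Ana, age=12], [id=3, name=John, age=23]]
        System.out.println(keepOldestPerName(employees));
    }
}
```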

ernest_k answered Oct 09 '22 23:10


Apart from the accepted answer, here are two variants:

Collection<Employee> employeesWithMaxAge = employees.stream()
    .collect(Collectors.toMap(
             Employee::getName,
             Function.identity(),
             BinaryOperator.maxBy(Comparator.comparing(Employee::getAge))))
    .values();

This one uses Collectors.toMap to group employees by name, keeping the Employee instances as the values. If several employees share a name, the third argument (a binary operator) selects the one with the greatest age.
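
If the order in which names first appear matters, one option is the four-argument overload of Collectors.toMap, which also takes a map supplier. The following is only a sketch under assumptions: the Employee class (int age, constructor, toString) and the method name keepOldestPerName are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.stream.Collectors;

class ToMapMaxAge {
    // Assumed shape of Employee: age is an int so it compares numerically.
    static class Employee {
        final String id;
        final String name;
        final int age;

        Employee(String id, String name, int age) {
            this.id = id;
            this.name = name;
            this.age = age;
        }

        String getName() { return name; }
        int getAge() { return age; }

        @Override
        public String toString() {
            return "[id=" + id + ", name=" + name + ", age=" + age + "]";
        }
    }

    // Hypothetical helper: keeps the oldest employee per name,
    // ordered by the first appearance of each name in the input.
    static List<Employee> keepOldestPerName(List<Employee> employees) {
        BinaryOperator<Employee> older =
                BinaryOperator.maxBy(Comparator.comparingInt(Employee::getAge));
        Map<String, Employee> byName = employees.stream()
                .collect(Collectors.toMap(
                        Employee::getName,
                        Function.identity(),
                        older,                // resolves clashes on the same name
                        LinkedHashMap::new)); // remembers key insertion order
        return new ArrayList<>(byName.values());
    }

    public static void main(String[] args) {
        List<Employee> employees = Arrays.asList(
                new Employee("1", "John", 10),
                new Employee("2", "Ana", 12),
                new Employee("3", "John", 23),
                new Employee("4", "John", 14));
        // prints [[id=3, name=John, age=23], [id=2, name=Ana, age=12]]
        System.out.println(keepOldestPerName(employees));
    }
}
```

John comes first in the result because the name "John" is seen before "Ana" in the input; replacing the value for an existing key does not move it in a LinkedHashMap.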

The other variant does the same, but doesn't use streams:

Map<String, Employee> map = new LinkedHashMap<>(); // preserves insertion order
employees.forEach(e -> map.merge(
        e.getName(), 
        e, 
        (e1, e2) -> e1.getAge() > e2.getAge() ? e1 : e2));

Or, with BinaryOperator.maxBy:

Map<String, Employee> map = new LinkedHashMap<>(); // preserves insertion order
employees.forEach(e -> map.merge(
        e.getName(), 
        e, 
        BinaryOperator.maxBy(Comparator.comparing(Employee::getAge))));
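
Since the question modifies the list in place with removeIf, the map built by merge could also be used as the extra condition for removeIf. This is a rough sketch, not the only way to do it: the Employee class (int age, constructor, toString) and the method name removeYoungerDuplicates are assumptions, and the identity-based set is used so the check does not depend on Employee overriding equals.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class RemoveIfMaxAge {
    // Assumed shape of Employee: age is an int so it compares numerically.
    static class Employee {
        final String id;
        final String name;
        final int age;

        Employee(String id, String name, int age) {
            this.id = id;
            this.name = name;
            this.age = age;
        }

        String getName() { return name; }
        int getAge() { return age; }

        @Override
        public String toString() {
            return "[id=" + id + ", name=" + name + ", age=" + age + "]";
        }
    }

    // Hypothetical helper: removes, in place, every employee that is not
    // the oldest one for its name. The list must support removal.
    static void removeYoungerDuplicates(List<Employee> employees) {
        // First pass: remember the oldest employee seen for each name.
        Map<String, Employee> oldestByName = new HashMap<>();
        employees.forEach(e -> oldestByName.merge(
                e.getName(),
                e,
                (e1, e2) -> e1.getAge() > e2.getAge() ? e1 : e2));

        // Second pass: drop everything that is not one of the winners.
        Set<Employee> keep = Collections.newSetFromMap(new IdentityHashMap<>());
        keep.addAll(oldestByName.values());
        employees.removeIf(e -> !keep.contains(e));
    }

    public static void main(String[] args) {
        // Arrays.asList alone is fixed-size, so copy it into an ArrayList.
        List<Employee> employees = new ArrayList<>(Arrays.asList(
                new Employee("1", "John", 10),
                new Employee("2", "Ana", 12),
                new Employee("3", "John", 23),
                new Employee("4", "John", 14)));
        removeYoungerDuplicates(employees);
        // prints [[id=2, name=Ana, age=12], [id=3, name=John, age=23]]
        System.out.println(employees);
    }
}
```

The surviving entries keep their original relative order in the list, because removeIf only deletes elements and never reorders them.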

fps answered Oct 09 '22 23:10