Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

"Partially" sorting list of POJO

I have a List of objects of the following class:

public class Foo {
    private Date date;
    private String name;
    private Long number;
}

This list is fetched from a database with order by date asc, number desc, but the part that need to be retained all the time is the ordering by date asc.

Example of the result (Dateformat = MM/dd/yyyy):

01/01/2016  Name1   928562
01/01/2016  Name2   910785
01/01/2016  Name3   811290
01/01/2016  Name4   811289
01/01/2016  Name5   5000000
02/01/2016  Name3   877702
02/01/2016  Name1   852960
02/01/2016  Name2   749640
02/01/2016  Name4   749500
02/01/2016  Name5   5000000

Now I want to order that list so that it results in:

01/01/2016  Name2   910785
01/01/2016  Name1   928562
01/01/2016  Name3   811290
01/01/2016  Name4   811289
01/01/2016  Name5   5000000
02/01/2016  Name2   749640
02/01/2016  Name1   852960
02/01/2016  Name3   877702
02/01/2016  Name4   749500
02/01/2016  Name5   5000000

As you can see, it is now sorted by date ascending and the name. The order for the names is stored in another List (NameSortingList):

Name2
Name1
Name3

Note that Name4 and Name5 are missing in the NameSortingList, can't be added to it and therefore should be added after everything that is ordered. Everything that comes after the ordered list can have any order.

If it makes it easier, everything that is not in the lsit can be merged into one Foo per unique date with name = "Other" which sums up the Numbers of all the elements in it. An example for a result like that:

01/01/2016  Name2   910785
01/01/2016  Name1   928562
01/01/2016  Name3   811290
01/01/2016  Other   5811289
02/01/2016  Name2   749640
02/01/2016  Name1   852960
02/01/2016  Name3   877702
02/01/2016  Other   5749500

My Current approach to this sorting is to extract all dates as unqiue values first, then build the NameSortingList and then iterate the data multipel times to add the data in the right ordering. The issue I have is

  1. It can miss entries if the name is not existing in the NameSortingList
  2. The performance is really really bad

data is a list of Foo as described at the very top:

List<String> sortedNames = data.stream().filter(e -> e.getDate().equals(getCurrentMonthDate()))
        .map(e -> e.getName()).collect(Collectors.toCollection(ArrayList<String>::new));

Set<Date> uniqueDates = data.stream().map(e -> e.getDate())
        .collect(Collectors.toCollection(LinkedHashSet<Date>::new));

List<Foo> sortedFoo= new ArrayList<Foo>();
for (Date d : uniqueDates) {
    for (String name : sortedNames) {
        for (Foo fr : data) {
            if (fr.Date().equals(d) && fr.getName().equals(name)) {
                sortedFoo.add(fr);
                break;
            }
        }
    }
}

How can I fix the 2 issues I described? Maybe there is even a stream solution to this which I couldn't wrap my head around?


If you have any questions, feel free to ask

like image 989
XtremeBaumer Avatar asked Nov 02 '18 12:11

XtremeBaumer


2 Answers

If I understand correctly, you have a list from the database that is sorted by date asc, number desc by default due to the query. You now want to sort it on date asc, name desc where the name are not sorted alphabetically, but based on the order they're in in the nameSortingList (where the names not in this list will be sorted at the end)?

If that is indeed the case, how about:

myList.sort(Comparator.comparing(Foo::getDate)
                      .thenComparing(foo-> {
  int index = nameSortingList.indexOf(foo.getName());
  return i == -1 ? // If not found, it should be sorted as trailing instead of leading name
    Integer.MAX_VALUE
   : // Otherwise, sort it on the index in the nameSortingList:
    i;} ));

EDIT: As correctly pointed out by @tobias_k in the comments. It might be best to first create a Map for your nameSortingList, where the names are the keys, and the index in the nameSortingList are the value. This will be better for performance, so you can change it to this instead:

myList.sort(Comparator.comparing(Foo::getDate)
                      .thenComparing(foo-> nameSortingMap.getOrDefault(foo.getName(), Integer.MAX_VALUE));

Although I doubt it will make that big of a difference for small lists.

like image 189
Kevin Cruijssen Avatar answered Oct 12 '22 22:10

Kevin Cruijssen


Create a secondary map establishing the name->ordinal order, then use that in your sorting. As specified, all missing names should be given the same order -- at the end. If this is not desired, you should add them dynamically first.

public class Sorter {
    static String input[] = {
        "01/01/2016  Name1   928562",
        "01/01/2016  Name2   910785",
        "01/01/2016  Name3   811290",
        "01/01/2016  Name4   811289",
        "02/01/2016  Name3   877702",
        "02/01/2016  Name1   852960",
        "02/01/2016  Name2   749640",
        "02/01/2016  Name4   749500",
        "02/01/2016  Name5   5000000"
    };
    static String names[] = { "Name2", "Name1", "Name3" };
    static class Foo {
        private Date date;
        private String name;
        private Long number;
        @Override
        public String toString() {
            return "Foo{" + "date=" + date + ", name=" + name + ", number=" + number + '}';
        }
    }
    static Foo parseInput(String s) throws Exception {
        Foo result = new Foo();
        String[] strs = s.split("  *");
        result.date = new SimpleDateFormat("dd/MM/yyyy").parse(strs[0]);
        result.name = strs[1];
        result.number = Long.parseLong(strs[2]);
        return result;
    }
    static class NameOrderCompare implements Comparator<Foo> {
        final Map<String,Integer> nameOrder = new HashMap<>();
        NameOrderCompare(String names[]) {
            for (String name : names) {
                nameOrder.put(name, nameOrder.size());
            }
        }
        @Override
        public int compare(Foo foo1, Foo foo2) {
            int cmp = foo1.date.compareTo(foo2.date);
            if (cmp != 0) return cmp;
            Integer order1 = nameOrder.getOrDefault(foo1.name, Integer.MAX_VALUE);
            Integer order2 = nameOrder.getOrDefault(foo2.name, Integer.MAX_VALUE);
            return order1 - order2;
        }
    }
    public static void main(String[] args) throws Exception {
        List<Foo> foos = new ArrayList<>();
        for (String s : input) {
            foos.add(parseInput(s));
        }
        Collections.sort(foos, new NameOrderCompare(names));
        for (Foo foo : foos) {
            System.out.println(foo);
        }
    }
}

When run this produces:

Foo{date=Fri Jan 01 00:00:00 MST 2016, name=Name2, number=910785}
Foo{date=Fri Jan 01 00:00:00 MST 2016, name=Name1, number=928562}
Foo{date=Fri Jan 01 00:00:00 MST 2016, name=Name3, number=811290}
Foo{date=Fri Jan 01 00:00:00 MST 2016, name=Name4, number=811289}
Foo{date=Sat Jan 02 00:00:00 MST 2016, name=Name2, number=749640}
Foo{date=Sat Jan 02 00:00:00 MST 2016, name=Name1, number=852960}
Foo{date=Sat Jan 02 00:00:00 MST 2016, name=Name3, number=877702}
Foo{date=Sat Jan 02 00:00:00 MST 2016, name=Name4, number=749500}
Foo{date=Sat Jan 02 00:00:00 MST 2016, name=Name5, number=5000000}

It should be noted that this does not take advantage of the fact that the input is already sorted by date. While it is not important for small data sets like this, it can matter especially when you have so much data that it should spill to disk. In that case you can adopt a blockwise strategy: read blocks by date, order them by name, output each sorted block. If blocks are small enough to fit in memory this can save the effort of a disk-based sort.

Sorry this answer is so verbose. There are doubtless more clever and compact ways to achieve the same end, but this should illustrate the concept clearly.

like image 28
Wheezil Avatar answered Oct 13 '22 00:10

Wheezil