Assume we have some items in a collection and we want to sort them using certain comparator, expecting result in a list:
Collection<Item> items = ...; Comparator<Item> itemComparator = ...;
One of the approaches is to sort items in a list, something like:
List<Item> sortedItems = new ArrayList<>(items); Collections.sort(sortedItems, itemComparator);
Anothe approach is using a sorted stream:
List<Item> sortedItems = items .stream() .sorted(itemComparator) .collect(Collectors.toList());
I wonder, which approach is more efficient? Are there any advantages of a sorted stream (like faste sorting on multiple cores)?
Efficient in a sense of runtime complexity/fastest.
I don't trust myself to implement a perfect benchmark and studying SortedOps
did not really enlighten me.
Collections class sort() method is used to sort a list in Java. We can sort a list in natural ordering where the list elements must implement Comparable interface. We can also pass a Comparator implementation to define the sorting rules.
The sorting time complexity is O(n), where n is the size of streaming data and the space complexity is O(M), where M is the size of working memory. Moreover, unlike the external sort, streaming data sort does not require any external space for the merge process.
Stream sorted() in JavaFor ordered streams, the sort method is stable but for unordered streams, no stability is guaranteed. It is a stateful intermediate operation i.e, it may incorporate state from previously seen elements when processing new elements.
To be honest I don't trust myself too much either in JMH
(unless I understand the assembly, which takes lots of time in my case), especially since I've used @Setup(Level.Invocation)
, but here is a small test (I took the StringInput
generation from some other test I did, but it should not matter, it's just some data to sort)
@State(Scope.Thread) public static class StringInput { private String[] letters = { "q", "a", "z", "w", "s", "x", "e", "d", "c", "r", "f", "v", "t", "g", "b", "y", "h", "n", "u", "j", "m", "i", "k", "o", "l", "p" }; public String s = ""; public List<String> list; @Param(value = { "1000", "10000", "100000" }) int next; @TearDown(Level.Invocation) public void tearDown() { s = null; } @Setup(Level.Invocation) public void setUp() { list = ThreadLocalRandom.current() .ints(next, 0, letters.length) .mapToObj(x -> letters[x]) .map(x -> Character.toString((char) x.intValue())) .collect(Collectors.toList()); } } @Fork(1) @Benchmark public List<String> testCollection(StringInput si){ Collections.sort(si.list, Comparator.naturalOrder()); return si.list; } @Fork(1) @Benchmark public List<String> testStream(StringInput si){ return si.list.stream() .sorted(Comparator.naturalOrder()) .collect(Collectors.toList()); }
Results show that Collections.sort
is faster, but not by a big margin:
Benchmark (next) Mode Cnt Score Error Units streamvsLoop.StreamVsLoop.testCollection 1000 avgt 2 0.038 ms/op streamvsLoop.StreamVsLoop.testCollection 10000 avgt 2 0.599 ms/op streamvsLoop.StreamVsLoop.testCollection 100000 avgt 2 12.488 ms/op streamvsLoop.StreamVsLoop.testStream 1000 avgt 2 0.048 ms/op streamvsLoop.StreamVsLoop.testStream 10000 avgt 2 0.808 ms/op streamvsLoop.StreamVsLoop.testStream 100000 avgt 2 15.652 ms/op
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With