Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Performance of element-compare in java collections

Problem:

Given two Collection<?>s, check if both contain the same elements.

  • Assuming that the actual implementation of the collection is unknown
  • Assuming that the elements do not occur in the same order
  • Assuming that no element does occur twice in the same collection

Solution 1:

boolean equals = c1.containsAll(c2) && c2.containsAll(c1);

Solution 2:

boolean equals = new HashSet<?>(c1).equals(new HashSet<?>(c2));

I would assume Solution 2 to be more efficient (O(n)) than Solution 1 (O(n^2)).

Am I correct or did I miss something?

like image 798
Stefan Winkler Avatar asked Jun 08 '15 14:06

Stefan Winkler


1 Answers

The Big O complexity of these is correct, Solution 1 involves iterating over one list (O (n)) for each item in the other list, which is O (n^2). Solution 2 involves two O (n) copies and iterating over one set (O (n)) and doing O (1) .contains() checks on the other set. All told, that's just O (n).

But depending on your constraints you can do better (not asymptotically better, just a better implementation).

  • Since we're assuming no duplicate elements, there's no need to do the second .containsAll() check. Simply check they're the same size (which might be O (n), but it's still better than duplicating an O (n^2) check) then do the .containsAll().

  • There's no need to convert c2 into a Set, since it will be iterated over anyways; just convert c1 and use .containsAll().

  • You can use instanceof Set to test if c1 or c2 is already a Set, and use that object's .containsAll() method; this will run in O (n) time even if the other object is not a set, and avoids the copying overhead that Solution 2 has.

like image 198
dimo414 Avatar answered Oct 21 '22 06:10

dimo414