
How Best to Compare Two Collections in Java and Act on Them?

I have two collections of the same object, Collection<Foo> oldSet and Collection<Foo> newSet. The required logic is as follows:

  • if foo is in(*) oldSet but not newSet, call doRemove(foo)
  • else if foo is not in oldSet but in newSet, call doAdd(foo)
  • else if foo is in both collections but modified, call doUpdate(oldFoo, newFoo)
  • else if !foo.activated && foo.startDate >= now, call doStart(foo)
  • else if foo.activated && foo.endDate <= now, call doEnd(foo)

(*) "in" means the unique identifier matches, not necessarily the content.
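To make the rules above concrete, here is a minimal sketch of one way they might be dispatched in a single pass, assuming the old items are first indexed by ID. Everything named here is hypothetical: the `Foo` fields (`id`, `content`, `activated`, `startDate`, `endDate`), the `FooSync` class, and the choice to treat "modified" as "content differs" are assumptions, and the `actions` list only stands in for the real doRemove/doAdd/doUpdate/doStart/doEnd calls.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the real Foo; all field names are assumptions.
final class Foo {
    final String id;
    final String content;
    final boolean activated;
    final Instant startDate;
    final Instant endDate;

    Foo(String id, String content, boolean activated, Instant startDate, Instant endDate) {
        this.id = id;
        this.content = content;
        this.activated = activated;
        this.startDate = startDate;
        this.endDate = endDate;
    }
}

final class FooSync {
    // Records which rule fired; real code would call doAdd(foo), doRemove(foo), etc.
    final List<String> actions = new ArrayList<>();

    void sync(Collection<Foo> oldSet, Collection<Foo> newSet, Instant now) {
        // Index the old items by ID so "in" means "same identifier".
        Map<String, Foo> oldById = new HashMap<>();
        for (Foo f : oldSet) {
            oldById.put(f.id, f);
        }

        for (Foo newFoo : newSet) {
            // remove() returns null when the ID exists only in newSet.
            Foo oldFoo = oldById.remove(newFoo.id);
            if (oldFoo == null) {
                actions.add("add " + newFoo.id);
            } else if (!oldFoo.content.equals(newFoo.content)) {
                // Assumption: "modified" means the content differs.
                actions.add("update " + newFoo.id);
            } else if (!newFoo.activated && !newFoo.startDate.isBefore(now)) {
                // startDate >= now, exactly as the rule is written in the question.
                actions.add("start " + newFoo.id);
            } else if (newFoo.activated && !newFoo.endDate.isAfter(now)) {
                // endDate <= now
                actions.add("end " + newFoo.id);
            }
        }

        // Anything still in the map was in oldSet but not in newSet.
        for (Foo removed : oldById.values()) {
            actions.add("remove " + removed.id);
        }
    }
}
```

With sets of at most a few dozen items, building the map is one pass over oldSet, and the matching step replaces nested comparisons with lookups; whether that is measurably faster than the legacy code would need profiling.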

The current (legacy) code does many comparisons to figure out removeSet, addSet, updateSet, startSet and endSet, and then loops to act on each item.

The code is quite messy (partly because I have left out some spaghetti logic already) and I am trying to refactor it. Some more background info:

  • As far as I know, the oldSet and newSet are actually backed by ArrayList
  • Each set contains fewer than 100 items, most likely maxing out at 20
  • This code is called frequently (measured in millions/day), although the sets seldom differ

My questions:

  • If I convert oldSet and newSet into a HashMap of Foo (order is not of concern here), with the IDs as keys, would it make the code easier to read and easier to compare? How much time and memory performance is lost on the conversion?
  • Would iterating over the two sets and performing the appropriate operation be more efficient and concise?
ckpwong asked Aug 22 '08 20:08


2 Answers

Apache's commons.collections library has a CollectionUtils class that provides easy-to-use methods for Collection manipulation/checking, such as intersection, difference, and union.

See the API docs for org.apache.commons.collections.CollectionUtils.
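For comparison, the same three operations can be approximated with the JDK alone. This is a rough sketch, not the CollectionUtils API: the `SetOps` class and its method names are hypothetical, and each method copies its first argument so the inputs are left untouched.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper; not part of Apache Commons Collections.
final class SetOps {
    // Elements present in both a and b.
    static <T> Set<T> intersection(Collection<T> a, Collection<T> b) {
        Set<T> result = new HashSet<>(a);
        result.retainAll(b);
        return result;
    }

    // Elements present in a but not in b.
    static <T> Set<T> difference(Collection<T> a, Collection<T> b) {
        Set<T> result = new HashSet<>(a);
        result.removeAll(b);
        return result;
    }

    // Elements present in a, in b, or in both.
    static <T> Set<T> union(Collection<T> a, Collection<T> b) {
        Set<T> result = new HashSet<>(a);
        result.addAll(b);
        return result;
    }
}
```

Note that these operations rely on equals() and hashCode(), so they only match the question's meaning of "in" if Foo defines equality by its unique identifier.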

user143081 answered Sep 24 '22 08:09



You can use Java 8 streams, for example:

set1.stream().filter(s -> set2.contains(s)).collect(Collectors.toSet()); 

or Sets class from Guava:

Set<String> intersection = Sets.intersection(set1, set2);
Set<String> difference = Sets.difference(set1, set2);
Set<String> symmetricDifference = Sets.symmetricDifference(set1, set2);
Set<String> union = Sets.union(set1, set2);
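Both snippets above compare whole elements through equals(), whereas the question defines "in" by unique identifier. A possible stream-based variant that matches on an extracted ID is sketched below; the `IdDiff` class, the `onlyIn` method, and the `idOf` parameter are hypothetical names introduced for illustration.

```java
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical helper for ID-based comparison.
final class IdDiff {
    // Returns the items of `left` whose ID does not occur among the IDs of `right`.
    static <T, K> List<T> onlyIn(Collection<T> left, Collection<T> right, Function<T, K> idOf) {
        // Collect the IDs of `right` once so each lookup is a hash probe.
        Set<K> rightIds = right.stream().map(idOf).collect(Collectors.toSet());
        return left.stream()
                .filter(item -> !rightIds.contains(idOf.apply(item)))
                .collect(Collectors.toList());
    }
}
```

Under these assumptions, `onlyIn(oldSet, newSet, idOf)` would yield the candidates for doRemove, and `onlyIn(newSet, oldSet, idOf)` the candidates for doAdd.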
Vitalii Fedorenko answered Sep 23 '22 08:09
