 

How can I compare two MultiMaps?

I have two Multimaps which have been created from two huge CSV files.

Multimap<String, SomeClassObject> mapOne = ArrayListMultimap.create();
Multimap<String, SomeClassObject> mapTwo = ArrayListMultimap.create();

I use one CSV column as the key, and each key has thousands of values associated with it. The data in the two Multimaps should be the same. I want to compare them and find any values that differ. Here are the two approaches I am considering:

Approach One:

Make one big list from each Multimap. The big list contains several smaller lists, one per key. Each smaller list holds the "key" read from the Multimap as its first element, followed by the values associated with that key.

ArrayList<Collection<SomeClassObject>> bigList = new ArrayList<Collection<SomeClassObject>>();

Within bigList will be individual small lists A, B, C etc.

For each small list in the first file's bigList, I plan to look for a small list in the second file's bigList that contains the same "key" element. If one exists, I compare the two lists and report anything that could not be matched.

Approach Two:

Compare the two Multimaps directly, but I am not sure how that would be done.

Which approach would have the shorter execution time? I need the comparison to finish as quickly as possible.

user3044240 asked Aug 27 '15 16:08




1 Answer

Use Multimaps.filterEntries(Multimap, Predicate).

If you want to get the differences between two Multimaps, it's very easy to write a filter based on containsEntry, and then use the filtering behavior to efficiently find all the elements that don't match. Just build the Predicate based on one map, and then filter the other.

Here's what I mean. This example uses Java 8 lambdas; a Java 7 version of the predicate follows further down:

import com.google.common.collect.ArrayListMultimap;
import com.google.common.collect.Multimap;
import com.google.common.collect.Multimaps;

public class MultimapDiff {
  public static void main(String[] args) {
    Multimap<String, String> first = ArrayListMultimap.create();
    Multimap<String, String> second = ArrayListMultimap.create();

    first.put("foo", "foo");
    first.put("foo", "bar");
    first.put("foo", "baz");
    first.put("bar", "foo");
    first.put("baz", "bar");

    second.put("foo", "foo");
    second.put("foo", "bar");
    second.put("baz", "baz");
    second.put("bar", "foo");
    second.put("baz", "bar");

    // Entries of first that have no equal entry in second.
    Multimap<String, String> firstSecondDifference =
        Multimaps.filterEntries(first, e -> !second.containsEntry(e.getKey(), e.getValue()));

    // Entries of second that have no equal entry in first.
    Multimap<String, String> secondFirstDifference =
        Multimaps.filterEntries(second, e -> !first.containsEntry(e.getKey(), e.getValue()));

    System.out.println(firstSecondDifference);
    System.out.println(secondFirstDifference);
  }
}

In this contrived example, the output is the entries of each multimap that are missing from the other:

{foo=[baz]}
{baz=[baz]}

These multimaps will be empty if the maps match.
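
One caveat worth checking against your data: `containsEntry` only tests whether a key-value pair is present, while `ArrayListMultimap` keeps duplicate pairs. So if a row appears twice in one CSV and once in the other, both filtered multimaps can still be empty. Below is a minimal sketch of a count-aware comparison, assuming duplicates matter to you. It works on collections of `Map.Entry`, which is the shape `Multimap.entries()` returns; the class `CountingDiff` and the helpers `surplus` and `entry` are made-up names for illustration, not part of Guava.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CountingDiff {

    // Hypothetical helper: builds an immutable key-value pair.
    public static <K, V> Map.Entry<K, V> entry(K key, V value) {
        return new SimpleImmutableEntry<K, V>(key, value);
    }

    // Hypothetical helper: entries of `left` that remain after cancelling each one
    // against at most one equal entry of `right` (a multiset difference).
    public static <K, V> List<Map.Entry<K, V>> surplus(Collection<Map.Entry<K, V>> left,
                                                       Collection<Map.Entry<K, V>> right) {
        // Count how many times each pair occurs on the right-hand side.
        Map<Map.Entry<K, V>, Integer> remaining = new HashMap<>();
        for (Map.Entry<K, V> e : right) {
            remaining.merge(new SimpleImmutableEntry<K, V>(e), 1, Integer::sum);
        }
        List<Map.Entry<K, V>> result = new ArrayList<>();
        for (Map.Entry<K, V> e : left) {
            Map.Entry<K, V> pair = new SimpleImmutableEntry<K, V>(e);
            Integer count = remaining.get(pair);
            if (count == null || count == 0) {
                result.add(pair);                 // no unmatched partner left on the right
            } else {
                remaining.put(pair, count - 1);   // consume one matching occurrence
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> first =
            Arrays.asList(entry("foo", "bar"), entry("foo", "bar"), entry("foo", "baz"));
        List<Map.Entry<String, String>> second =
            Arrays.asList(entry("foo", "bar"), entry("foo", "baz"));

        System.out.println(surplus(first, second)); // prints [foo=bar]
        System.out.println(surplus(second, first)); // prints []
    }
}
```

With real multimaps you would pass `first.entries()` and `second.entries()`. As with the filter approach, the value type needs consistent `equals` and `hashCode`.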


In Java 7, you can create the predicate manually, using something like this:

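import com.google.common.base.Predicate;
import com.google.common.collect.Multimap;
import java.util.Map;

// Matches entries that are NOT present in the multimap passed to the constructor.
public static class FilterPredicate<K, V> implements Predicate<Map.Entry<K, V>> {
  private final Multimap<K, V> filterAgainst;

  public FilterPredicate(Multimap<K, V> filterAgainst) {
    this.filterAgainst = filterAgainst;
  }

  @Override
  public boolean apply(Map.Entry<K, V> entry) {
    return !filterAgainst.containsEntry(entry.getKey(), entry.getValue());
  }
}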

Use it as an argument to Multimaps.filterEntries() like this:

Multimap<String, String> firstSecondDifference =
    Multimaps.filterEntries(first, new FilterPredicate<String, String>(second));

Multimap<String, String> secondFirstDifference =
    Multimaps.filterEntries(second, new FilterPredicate<String, String>(first));

Otherwise, the code is the same (with the same result) as the Java 8 version above.
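
On the execution-time question: as far as I know, `containsEntry` on an `ArrayListMultimap` has to scan the list stored under the key, so with thousands of values per key each lookup is linear in that list's size. Here is a minimal sketch of the per-key comparison from Approach One that uses a hash set per key instead. Plain JDK maps stand in for the `Map<K, Collection<V>>` view that `Multimap.asMap()` returns, and `String` stands in for `SomeClassObject`, which would need consistent `equals` and `hashCode`. The names `PerKeyDiff` and `unmatched` are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class PerKeyDiff {

    // Hypothetical helper: for each key of `left`, the values that have no equal
    // value under the same key in `right`. Keys with nothing missing are omitted.
    public static <K, V> Map<K, List<V>> unmatched(Map<K, Collection<V>> left,
                                                   Map<K, Collection<V>> right) {
        Map<K, List<V>> result = new LinkedHashMap<>();
        for (Map.Entry<K, Collection<V>> entry : left.entrySet()) {
            // Copy the other side's values into a HashSet once per key,
            // so each membership test is O(1) on average.
            Set<V> otherValues = new HashSet<>();
            Collection<V> other = right.get(entry.getKey());
            if (other != null) {
                otherValues.addAll(other);
            }
            List<V> missing = new ArrayList<>();
            for (V value : entry.getValue()) {
                if (!otherValues.contains(value)) {
                    missing.add(value);
                }
            }
            if (!missing.isEmpty()) {
                result.put(entry.getKey(), missing);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Collection<String>> first = new LinkedHashMap<>();
        first.put("foo", Arrays.asList("foo", "bar", "baz"));
        first.put("bar", Arrays.asList("foo"));
        first.put("baz", Arrays.asList("bar"));

        Map<String, Collection<String>> second = new LinkedHashMap<>();
        second.put("foo", Arrays.asList("foo", "bar"));
        second.put("bar", Arrays.asList("foo"));
        second.put("baz", Arrays.asList("baz", "bar"));

        System.out.println(unmatched(first, second)); // prints {foo=[baz]}
        System.out.println(unmatched(second, first)); // prints {baz=[baz]}
    }
}
```

Whether this is actually faster for your files depends on the data, so it is worth timing both versions on a realistic sample before committing to one.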

durron597 answered Oct 14 '22 14:10