Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Java collection for this use case

Let's say we have a bunch of Car objects.

Each Car has some distinguishing properties e.g. manufacturer, model, year, etc. (these can be used to create distinct hashCodes).

Each car has a List of PurchaseOffer objects (a PurchaseOffer object contains pricing\retailer info).

We receive Lists of Cars from several different sources, each Car with a single PurchaseOffer. Thing is, these lists may overlap - a Car can appear in more than one list.

We wish to aggregate the lists into a single collection of Cars where each Car holds all encountered PurchaseOffers for it.

My Problem is choosing what to collection to use in this aggregation process:

Feels natural to use java.util.HashSet for holding our cars, that way when going over the different lists of Cars, we can check if a car already exists in the Set in amortized O(1), however - you cannot retrieve an element from a Set (in our case - when we go encounter a Car that already exists in the Set - we would have liked to retrieve that Car from the Set based on its identifying hashCode and add PurchaseOffers to it).

I can use a HashMap where each Car's hashCode maps to the actual Car object, but it probably isn't the school-book solution since it is unsafe - I would have to make sure myself that every hashCode maps to a Car with that hashCode - there could be inconsistency. Of course, can make a designated data structure that guarantees this consistency - Shouldn't one already exist ?

Can anyone suggest the data-structure I am after, or point out a design mistake ? Thanks.

like image 471
bloodcell Avatar asked Apr 05 '11 09:04

bloodcell


4 Answers

Since this is a many-to-many relationship, you need a bi-directional multi-map. Car is the key for the first one, with a List of PurchaseOrder as the value. The PurchaseOrder is the key for the second one, with a List of Cars as the value.

The underlying implementation is two HashMaps.

Put an API on top of it to get the behavior you need. Or see if Google Collections can help you. It's a combination of a BiMap and two MultiMaps.

like image 103
duffymo Avatar answered Oct 01 '22 16:10

duffymo


I think that you really do need (at least) a HashMap<Car, List<PurchaseOffer>> ... as suggested by @Andreas_D

Your objection that each Car already has a List<PurchaseOffer> is beside the point. The list in the HashMap is the aggregate list, containing all PurchaseOffer objects from all Car objects that stand for the same physical car.

The point of creating a new list is to avoid changing the original lists on the original Car objects. (If that was not a concern, then you could pick one instance of Car from the set that represent a physical car, and merge the PurchaseOffer objects from the others into that list.)

I'm not entirely sure why @duffymo suggested a bi-directional map between, but I think it is because the different Car objects from different sources may have complementary (or contradictory) information for the same physical car. By keeping all instances, you avoid discarding information. (Once again, if you are happy to discard mutate and/or discard information, you could attempt to merge the information about each individual car into a single Car object.


If you really didn't care about preserving information and were prepared to merge stuff willy-nilly then the following approach would probably work:

  HashMap<Car, Car> map = new HashMap<Car, Car>(...);
  for (Car car : carsToBeAggregated) {
      Car master = nap.get(car);
      if (master == null) {
          map.put(car, car);
      } else {
          master.offers.addAll(car.offers);
          // optionally, merge other Car information from car to master
      }
  }

You should NOT be trying to use the Car.hashCode() as a key for anything. Hashcode values are not unique identifiers: there is a distinct possibility that two different cars will end up with the same hashcode value. If you attempt to use them as if they were unique identifiers you'll get into trouble ...

like image 20
Stephen C Avatar answered Oct 01 '22 17:10

Stephen C


The basic datastructure should be a HashMap<Car, List<PurchaseOffer>>. This allows for storing and receiving all offers for one selected car.

Now you may have to find a suitable implementation for Car.equals() to assure, that "cars" coming from different source are really the same. What about basing equals() on a unique identifier for a real world car (VIN)?

like image 27
Andreas Dolk Avatar answered Oct 01 '22 16:10

Andreas Dolk


I would prefer to use a HashMap<Car, List<PurchaseOffer>>, as suggested before (Andreas, Stephen), mainly if the Car object does not hold the list of PurchaseOffers.
Otherwise I would consider using a HashMap<Car, Car> or, better IMO, a HashMap<ID, Car> if there is an unique ID for each Car.

It can not simply map the Car's hashCode to the Car, as mentioned in the question, since distinct Cars can have the same hashCode!

(Anyway, I would create an own class for storing and managing the Cars. This would contain the HashMap, or whichever - so it's easy to change the implementation without needing to change its interface)

like image 33
user85421 Avatar answered Oct 01 '22 18:10

user85421