Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

"Left join" of two different Java objects

Tags:

java

left-join

I have a list of Object1 (List<Object1>) and a list of Object2 (List<Object2>)

  • Object 1 has multiple properties, including id
  • Object 2 has multiple properites, including object1id

I have some SQL background and what I'm trying to do is to perform a "left join" on

object1.id = object2.object1id

This would result in a List<Object3> that represents the left join. I could hardcode an algorithm in Java (for... for...), but I'm sure this wouldn't be efficient with at least a complexity of n*m.

Do you have a better solution? (with code if possible, thanks!)

like image 764
Stephane Maarek Avatar asked Aug 31 '14 15:08

Stephane Maarek


5 Answers

You are trying to do something that Java is not really meant for.

If you are able to do it, you would be better off adding an attribute to Object1, which would be a list of Object2 containing the objects related to this.

If you can't, we still have the option of doing it naively, else you could try something like that:

HashSet<Integer> hs = new HashSet<Integer>(list2.size());
for(Object2 o : list2) {
    hs.add(o.object1id);
}
//hs contains all the ids of list2
List<Object1> result = new ArrayList<Object1>(); //Or another class implementing List
for(Object1 o : list1) {
    if(hs.contains(o.id))
        result.add(o);
}

Not pretty since you have to store all the ids in an HashSet, but since adding and accessing elements in HashSet are O(1) (theoretically), the algorithm is O(n+m)

If your Object3 class is constructed with an Object1 and Object2, use an HasMap instead of HashSet where the keys are ids, and the values object2. The last for loop in the code will become:

Object2 o2 = hs.get(o.id);
if(o2 != null)
    result.add(new Object3(o, o2);

Further to Óscar López comment:

If your objectid1 your not unique, you have to adapt the code as follows:

HashMap<Integer, List<Object2>> hm = new HashMap<Integer, List<Object2>>();
for(Object2 o : list2) {
    List<Object2> l = hm.get(o.objectid1);
    if(l != null) {
        l.add(o);
    } else {
        List<Object2> l = new ArrayList<Object2>();
        l.add(o);
        hm.put(o.objectid1, l);
}
//hm is map, where each entry contains the list of Object2 associated with objectid1
List<Object1> result = new ArrayList<Object1>();
for(Object1 o : list1) {
    List<Object2> l = hm.get(o.id);
    //l contains all Object2 with object1id = o.id
    for(Object2 o2 : l)
        result.add(new Object3(o, o2));
}

Still in O(n+m), but with bigger constants...

like image 174
R2B2 Avatar answered Nov 12 '22 18:11

R2B2


If you're using Java 8, you can leverage streams. It might look something like this (assuming id is the id of the Object1 to look up):

List<Object3> newList = obj2List.stream().filter(x -> x.object1id == id).map(x -> obj2To3(x)).collect(Collectors.toList());

The case provided is pretty vague, so it's hard to give a more detailed answer.

like image 39
Dogsworth Avatar answered Nov 12 '22 18:11

Dogsworth


Create an index on List. Scan List and fill the index:

HashMap<Integer, Object2> index=HashMap<Integer, Object2>();
for (Object2 obj2: list2) {
   index.put(obj2.object1id, obj2);
}

Then, scan the List and do the join:

for (Object1 obj1: list1) {
   Object2 obj2=index.get(obj1.id); // may be null
   Object3 obj3=new Object3(obj1, obj2);
}
like image 26
Alexei Kaigorodov Avatar answered Nov 12 '22 17:11

Alexei Kaigorodov


A good solution might be converting the List of Object2 to a Map. Then iterate through the Object1 List and get the Object2 from Map, ultimately creating the Join and adding result in Object3 List.

like image 1
Rami Del Toro Avatar answered Nov 12 '22 17:11

Rami Del Toro


I believe an O(n*m) solution is inevitable, unless a more complex data structure infrastructure is created - efficient joins in a database are implemented using indexes, hashes, etc. Also bear in mind that a correct implementation should consider the case where more than one object in list2 has the same object1id - my code works in this case, but all the solutions that simply add obj2.object1id to a Set or as keys in a Map, will fail.

But is the implementation complexity worth it? if the input lists are small an O(n*m) solution will work just fine. Here's my proposal, using good old nested loops:

List<Object3> list3 = new ArrayList<>();
for (Object1 obj1 : list1) {
    boolean found = false;
    for (Object2 obj2 : list2) {
        if (obj1.id.equals(obj2.object1id)) {
            list3.add(new Object3(obj1, obj2));
            found = true;
        }
    }
    if (!found)
        list3.add(new Object3(obj1, null));
}

For the above to work, I'm using an output object that looks like this:

public class Object3 {
    private Object1 obj1;
    private Object2 obj2;
    public Object3(Object1 obj1, Object2 obj2) {
        this.obj1 = obj1;
        this.obj2 = obj2;
    }
}
like image 1
Óscar López Avatar answered Nov 12 '22 17:11

Óscar López