Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can I join two lists in less than O(N*M)?

Assume we have two tables (think as in SQL tables), where the primary key in one of them is the foreign key in the other. I'm supposed to write a simple algorithm that would imitate the joining of these two tables. I thought about iterating over each element in the primary key column in the first table, having a second loop where it checks if the foreign key matches, then store it in an external array or list. However, this would take O(N*M) and I need to find something better. There is a hint in the textbook that it involves hashing, however, I'm not sure how hashing could be implemented here or how it would make it better?

Editing to add an example:

Table Employees
|ID (Primary Key)| Name | Department|
|:---------------|:----:|----------:|
|       1        | Jack |    IT     |   
|       2        | Amy  |  Finance  |

Table Transactions
|Sold Product| Sold By(Foreign Key)| Sold To|
|:-----------|:-------------------:|-------:|
|    TV      |          1          |  Mary  |
|   Radio    |          1          |  Bob   |
|   Mobile   |          2          |  Lisa  |

What I want to do is write an algorithm that joins these two tables using hashing, and not anything complex, just a simple idea on how to do that.

like image 568
April Crude Avatar asked Jan 25 '23 04:01

April Crude


1 Answers

Populate a HashMap with the primary table elements, using their keys as the map key and the whole object as the value. This requires only 1 pass over the primary table and each operation to add to the HashMap occurs in constant time O(1). This is akin to creating a database index.

Iterate over the child table, “joining” to the parent row in constant time O(1) by getting the parent from the map.

Total runtime is 1 pass over each “table”, so O(n + m).

Code might look something like:

class Parent {
    int id;
    // other fields, getters etc
}

class Child {
    int parentId;
    // other fields, getters etc
}

void join(List<Parent> parents, List<Child> children) {
    Map<Integer, Parent> parentMap = parents.stream()
      .collect(toMap(Parent::getKey, p -> p)); // FYI toMap collects to a HashMap
    for (Child child : children) {
        Parent parent = parentMap.get(child.getParentId());
        // not sure what “join” means, but you now have child and its parent
    }
}
like image 137
Bohemian Avatar answered Jan 26 '23 17:01

Bohemian