Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Path reconstruction with Hashing?

I've been trying to figure out how to find a O(n) time complexity algorithm to solve the following in Java:

We are given an input pair with a start point and end point, and we have to construct a path such that the start of one input matches the end of another input (in this case, alphabetically)

EX: if I have a list

P B
B M
O P
M Z

where PB, or BM are pairs, P being the start point and B being the end

I should create as output

O P
P B
B M
M Z

And naturally, I want to check for cycles -- in this case I will just report if there is a cycle.

The first time I tried to do this I used a swapping algorithm that was O(N^2), by basically swapping two entries if there was a match and moving down the list. There is a definitely a faster way to do this, and I understand semi-intuitively how to do it but I was wondering if someone could clear it up a little.

Basically, I'm assuming you need to make some sort of hash/key type of structure, where you can reference the value of the object itself by a key.

For example, you would keep two collections, one of starts and one of ends. You could take each input and create an object that has a start and end field, and then add all of the starts and ends to two arrays. Then you would have to find basically end[start] for each start and add its respective object to a list after finding them.

My only thing is, I can't figure out exactly how to implement this in Java using a hash table or some similar data structure. Do I have to add the starts and ends to separate hash tables and then use the start points as lookup keys?

Pseudocode from looking at someone who solved it in python on github:

for inputs
    parse input 
    add parse[1] to starts, add parse[2] to ends

for starts
    find origin (a start not in ends) <--requires hash?

if no origin
    cycle exists

for inputs
    find ends[origin] <--requires hash?
    origin = ends[origin] <-- so we can find the next one

Wondering if someone could help me translate this to an algorithm in Java (other efficient solutions very welcome since I'm interested in this type of problem solving) or understanding it more generally from a data structures perspective.

like image 561
frei Avatar asked Jun 10 '15 00:06

frei


2 Answers

Here's a simple implementation using HashMaps in Java:

    String[][] paths = {
        {"P", "B"}, 
        {"B", "M"},
        {"O", "P"},
        {"M", "Z"},
    };

    // create a single hash map, mapping start->end
    HashMap<String, String> end = new HashMap<>();

    for(int i = 0; i < paths.length; i++)
    {
        end.put(paths[i][0], paths[i][1]);
    }

    // find if a cycle exists
    String origin = null;
    for (String key : end.keySet()) {
        if(!end.containsValue(key))
        {
            origin = key;
            break;
        }
    }

    if(origin == null)
    {
        System.out.println("cycle exists");
        return;  // no origin found, cycle exists
    }

    // iterate over hash map:
    int count = 0;
    while(true)
    {
        if(!end.containsKey(origin))
            break;
        String next = end.get(origin);
        System.out.println(origin + " " + next);
        origin = next;
        if(++count > paths.length)
        {
            System.out.println("cycle exists");
            break;
        }
    }

end stores the end point (value) given the start point (key).

If a point exists as a start point but not an end point then it will be a key in end but not a value in end. So iterating over the keySet of end and checking if end contains each key as a value will find an origin. If there is no such origin then there is a cycle. However, this is not sufficient to determine if there is a cycle because there may be a cycle even if there is an origin. Consider this set:

P B 
B M
O P
M Z
Z M

There is a unique origin (O) but there is also a cycle M -> Z -> M. To detect this you can walk the set and keep track of which point you have already visited, or more simply you can walk the set and if you end up visiting more points than the length of the input then there must be a cycle.

To walk the set, set a string to the origin. Then keep looking up the value in end given origin as the key, and print out the key, value pair (origin, end.get(origin)) as the start, end. Then set end.get(origin) as the new origin. The loop terminates when there is no value found in the end HashMap for the current origin.

This algorithm fails if there are duplicate start or end positions in the list, e.g:

P B 
B M
O P
M Z
B Z

It's not clear from the question if you have to handle this case.

like image 138
samgak Avatar answered Oct 10 '22 03:10

samgak


Although I would prefer constructing a linked-list like structure with something like

class Node {
    String name;
    Node prev;
    Node next;
}

it is still possible to do it with Map (HashMap if you want) only.

It looks similar to the answer from @samgak, just with some tricks that can make your code looks shorter (not necessary faster, haven't benchmarked it):

String[][] rawPaths = {
    {"P", "B"}, 
    {"B", "M"},
    {"O", "P"},
    {"M", "Z"},
};


// 1. You can keep only one `Map<String, String> which keeps start as key and end as value.

Map<String, String> pathMap = new HashMap<>();
for (String[] path : rawPaths) {
    pathMap.put(path[0], path[1]);
}

// 2. Some validity checks for data:

// 2.1 No of entry in final Map should equals to number of lines in raw data, which means there is no duplicated "start" node.
assert pathMap.size() == rawPaths.length;

// 2.2 There should be no duplicate in "end", i.e. no partial cyclic path
assert new HashSet<>(pathMap.values()).size() == rawPaths.length;

// 2.3 there should be one and only one start value that does not exists in end values, i.e. no full cyclic path
Set<String> startPoints = new HashSet<>(pathMap.keySet());
startPoints.removeAll(pathMap.values());

assert startPoints.size() == 1;

//3. Then you can make up the path by getting the "head" which is the only value left in startPoints, and construct the full path

String start= startPoints.iterator().next();

while (pathMap.contains(start)) {
    String end = pathMap.get(start);
    System.out.println(start + " " + end);
    start = end;
}

There may still be cases that there is cyclic island path like this:

A B
B C
C D
X Y
Y X

That can be easily checked by, after you traversed through part 3, keep an counter and the counter should finally be the same as pathMap.size(). If not, there is cyclic island

like image 1
Adrian Shum Avatar answered Oct 10 '22 05:10

Adrian Shum