I've been trying to figure out how to find a O(n) time complexity algorithm to solve the following in Java:
We are given an input pair with a start point and end point, and we have to construct a path such that the start of one input matches the end of another input (in this case, alphabetically)
EX: if I have a list
P B
B M
O P
M Z
where PB, or BM are pairs, P being the start point and B being the end
I should create as output
O P
P B
B M
M Z
And naturally, I want to check for cycles -- in this case I will just report if there is a cycle.
The first time I tried to do this I used a swapping algorithm that was O(N^2), by basically swapping two entries if there was a match and moving down the list. There is a definitely a faster way to do this, and I understand semi-intuitively how to do it but I was wondering if someone could clear it up a little.
Basically, I'm assuming you need to make some sort of hash/key type of structure, where you can reference the value of the object itself by a key.
For example, you would keep two collections, one of starts and one of ends. You could take each input and create an object that has a start and end field, and then add all of the starts and ends to two arrays. Then you would have to find basically end[start] for each start and add its respective object to a list after finding them.
My only thing is, I can't figure out exactly how to implement this in Java using a hash table or some similar data structure. Do I have to add the starts and ends to separate hash tables and then use the start points as lookup keys?
Pseudocode from looking at someone who solved it in python on github:
for inputs
parse input
add parse[1] to starts, add parse[2] to ends
for starts
find origin (a start not in ends) <--requires hash?
if no origin
cycle exists
for inputs
find ends[origin] <--requires hash?
origin = ends[origin] <-- so we can find the next one
Wondering if someone could help me translate this to an algorithm in Java (other efficient solutions very welcome since I'm interested in this type of problem solving) or understanding it more generally from a data structures perspective.
Here's a simple implementation using HashMaps in Java:
String[][] paths = {
{"P", "B"},
{"B", "M"},
{"O", "P"},
{"M", "Z"},
};
// create a single hash map, mapping start->end
HashMap<String, String> end = new HashMap<>();
for(int i = 0; i < paths.length; i++)
{
end.put(paths[i][0], paths[i][1]);
}
// find if a cycle exists
String origin = null;
for (String key : end.keySet()) {
if(!end.containsValue(key))
{
origin = key;
break;
}
}
if(origin == null)
{
System.out.println("cycle exists");
return; // no origin found, cycle exists
}
// iterate over hash map:
int count = 0;
while(true)
{
if(!end.containsKey(origin))
break;
String next = end.get(origin);
System.out.println(origin + " " + next);
origin = next;
if(++count > paths.length)
{
System.out.println("cycle exists");
break;
}
}
end
stores the end point (value) given the start point (key).
If a point exists as a start point but not an end point then it will be a key in end
but not a value in end
. So iterating over the keySet of end
and checking if end
contains each key as a value will find an origin. If there is no such origin then there is a cycle. However, this is not sufficient to determine if there is a cycle because there may be a cycle even if there is an origin. Consider this set:
P B
B M
O P
M Z
Z M
There is a unique origin (O) but there is also a cycle M -> Z -> M. To detect this you can walk the set and keep track of which point you have already visited, or more simply you can walk the set and if you end up visiting more points than the length of the input then there must be a cycle.
To walk the set, set a string to the origin. Then keep looking up the value in end
given origin
as the key, and print out the key, value pair (origin, end.get(origin)) as the start, end. Then set end.get(origin) as the new origin. The loop terminates when there is no value found in the end HashMap for the current origin.
This algorithm fails if there are duplicate start or end positions in the list, e.g:
P B
B M
O P
M Z
B Z
It's not clear from the question if you have to handle this case.
Although I would prefer constructing a linked-list like structure with something like
class Node {
String name;
Node prev;
Node next;
}
it is still possible to do it with Map (HashMap if you want) only.
It looks similar to the answer from @samgak, just with some tricks that can make your code looks shorter (not necessary faster, haven't benchmarked it):
String[][] rawPaths = {
{"P", "B"},
{"B", "M"},
{"O", "P"},
{"M", "Z"},
};
// 1. You can keep only one `Map<String, String> which keeps start as key and end as value.
Map<String, String> pathMap = new HashMap<>();
for (String[] path : rawPaths) {
pathMap.put(path[0], path[1]);
}
// 2. Some validity checks for data:
// 2.1 No of entry in final Map should equals to number of lines in raw data, which means there is no duplicated "start" node.
assert pathMap.size() == rawPaths.length;
// 2.2 There should be no duplicate in "end", i.e. no partial cyclic path
assert new HashSet<>(pathMap.values()).size() == rawPaths.length;
// 2.3 there should be one and only one start value that does not exists in end values, i.e. no full cyclic path
Set<String> startPoints = new HashSet<>(pathMap.keySet());
startPoints.removeAll(pathMap.values());
assert startPoints.size() == 1;
//3. Then you can make up the path by getting the "head" which is the only value left in startPoints, and construct the full path
String start= startPoints.iterator().next();
while (pathMap.contains(start)) {
String end = pathMap.get(start);
System.out.println(start + " " + end);
start = end;
}
There may still be cases that there is cyclic island path like this:
A B
B C
C D
X Y
Y X
That can be easily checked by, after you traversed through part 3, keep an counter and the counter should finally be the same as pathMap.size()
. If not, there is cyclic island
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With