You have N nodes, labeled 1…N (N <= 50000). Each node has a designated ID in the range 1…M, where M is at most 50. You are trying to find the shortest path from 1 to N.
You are also given a matrix M x M, to determine if there is are directed edges between nodes of different IDs. M_ij = ‘Y’ if a node of ID i has a directed edge to a node of ID j, and ’N’ if not. Note that M_ij may not equal M_ji, and that M_ii could be = 0.
If there is an edge between two nodes, then the weight of the directed edge between them is the difference between their labels.
Example:
There are 5 nodes.
Node 1: ID 1
Node 2: ID 4
Node 3: ID 2
Node 4: ID 3
Node 5: ID 4
Matrix:
YNYN
NNNY
NYYN
NYNN
So in this matrix, we know that nodes of ID 1 have directed edges to nodes of ID 1 and ID 3. Nodes of ID 2 have directed edges to nodes of ID 4. Nodes of ID 3 have directed edges to nodes of ID 2 and ID 3. And nodes of ID 4 have directed edges to nodes of ID 2.
The answer = 6.
The shortest path between Nodes 1 and 5 is through a directed edge from Node 1 to Node 4, of weight 4 - 1 = 3. Note that this edge is possible because Node 1 is of ID 1, and Node 4 is of ID 3, and nodes of ID 1 have directed edges to all nodes of ID 3. Next, you would use the directed edge from Node 4 to Node 3, of weight 4 - 3 = 1; then the directed edge from Node 3 to Node 5, of weight 5 - 3 = 2. 3 + 1 + 2 = 6, so the total shortest path is 6.
My solution:
When I first read this problem, I thought that Dijkstra’s algorithm would solve it easily. However, its time complexity is N^2logN, and the number of edges could be at most 50000^2 which is too high. How can I solve the shortest path problem in a dense graph? Is there a way to optimize Dijkstra by utilizing the matrix in some way?
Lets say there are 3 nodes with different labels who share the same id 1:
ID 1 <==> Labels {1, 7, 13}
Similary, there are 5 nodes who share the same id 2:
ID 2 <==> Labels {2, 8, 12, 15}
3 nodes who share the same id 3:
ID 3 <==> Labels {5, 10, 11}
and 2 nodes who share the same id 4:
ID 4 <==> Labels {6, 9}
Now lets say we have 4 edges:
If we want to find out the smallest distance between labels '7' and '9', how would we do it?
Well label 7 means ID1 and label 9 means ID4. That means we need to find the shortest distance between ID1 and ID4.
But since there is no direct edge between these 2 ids, we would have to jump to ID 2 or 3 first.
Lets calculate the distance to ID 2:
Which label should we jump to within ID2 from ID1?
Since distance is defined as the absolute difference between 2 label values, and source label is '7', we should jump at a label which is closest to value to 7.
Among {1, 2, 8, 12, 15}, this would be '8' (distance = 8-7 = 1)
Lets calculate the distance to ID 3:
Which label should we jump to within ID3 from ID1?
Among {5, 10, 11}, the number closest to 7 would be '5' (distance = 7-5 = 2)
Since distance to ID2 is smaller than ID3, we should jump to ID2, label 8.
Now that we are at label 8, we can directly jump to label 9 in ID4 (distance = 9-8 = 1)
Total distance = 1 + min(1, 2) + 1 = 3
The general algorithm would be:
Finding a number closest to a value in a sorted array can be done in O(logn), where n is the size of the sorted array.
The algorithmic complexity for dijkstra would be O(M^2 * logM * logX), where X = average number of labels sharing a single ID, which can be safely assumed to be N / M.
Creating hashmap with sorted lists would take O(M*X*logX) = O(N * logN/M) complexity.
So total complexity = O((N * log(N/M)) + (M^2 * logM * log(N/M))
~ O(log(N/M) * (N + M^2)
Disclaimer:
As I understand your formulation, edge ID1 -> ID2 means that from a node with ID1 there is an edge to every other node with ID2. If this is not correct, simply ignore the rest. For now I will assume this to be the truth.
Observation:
Since the edge ID1->ID2 means that every node with ID1 has an edge to every other node with ID2 it doesn't matter which node with certain ID we use - they are all equivalent for the shortest path purpose.
This means that to compute the length of the shortest path from a node Label_X, ID_I to a node Label_Y, ID_J we can simply compute the Djikstra over a graph that considers every label present exactly once to find a shortest path over labels. This will take O(M * log(M^2)) operations.
Now, since we have the path of labels we can simply select a first node, with the required label, that we encountered. Since we have to read the input anyways, we can simply create an array with a first node of every ID that we encounter. The reading of the input will take ~ N + M^2 operations
The total complexity in this case would be O(N + M^2 + M * log(M^2)). In your case the total would be 50000 + 2500 + 50 * log(2500) = ~ 53100.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With