Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Optimized dijkstra’s algorithm for an extremely dense graph with matrix

You have N nodes, labeled 1…N (N <= 50000). Each node has a designated ID in the range 1…M, where M is at most 50. You are trying to find the shortest path from 1 to N.

You are also given a matrix M x M, to determine if there is are directed edges between nodes of different IDs. M_ij = ‘Y’ if a node of ID i has a directed edge to a node of ID j, and ’N’ if not. Note that M_ij may not equal M_ji, and that M_ii could be = 0.

If there is an edge between two nodes, then the weight of the directed edge between them is the difference between their labels.

Example:

There are 5 nodes.

Node 1: ID 1

Node 2: ID 4

Node 3: ID 2

Node 4: ID 3

Node 5: ID 4

Matrix:

YNYN

NNNY

NYYN

NYNN

So in this matrix, we know that nodes of ID 1 have directed edges to nodes of ID 1 and ID 3. Nodes of ID 2 have directed edges to nodes of ID 4. Nodes of ID 3 have directed edges to nodes of ID 2 and ID 3. And nodes of ID 4 have directed edges to nodes of ID 2.

The answer = 6.

The shortest path between Nodes 1 and 5 is through a directed edge from Node 1 to Node 4, of weight 4 - 1 = 3. Note that this edge is possible because Node 1 is of ID 1, and Node 4 is of ID 3, and nodes of ID 1 have directed edges to all nodes of ID 3. Next, you would use the directed edge from Node 4 to Node 3, of weight 4 - 3 = 1; then the directed edge from Node 3 to Node 5, of weight 5 - 3 = 2. 3 + 1 + 2 = 6, so the total shortest path is 6.

My solution:

When I first read this problem, I thought that Dijkstra’s algorithm would solve it easily. However, its time complexity is N^2logN, and the number of edges could be at most 50000^2 which is too high. How can I solve the shortest path problem in a dense graph? Is there a way to optimize Dijkstra by utilizing the matrix in some way?

like image 272
cracra Avatar asked Nov 20 '25 00:11

cracra


2 Answers

Lets say there are 3 nodes with different labels who share the same id 1:
ID 1 <==> Labels {1, 7, 13}

Similary, there are 5 nodes who share the same id 2:
ID 2 <==> Labels {2, 8, 12, 15}

3 nodes who share the same id 3:
ID 3 <==> Labels {5, 10, 11}

and 2 nodes who share the same id 4:
ID 4 <==> Labels {6, 9}

Now lets say we have 4 edges:

  • ID1 -> ID2
  • ID1 -> ID3
  • ID2 -> ID4
  • ID3 -> ID4

If we want to find out the smallest distance between labels '7' and '9', how would we do it?

Well label 7 means ID1 and label 9 means ID4. That means we need to find the shortest distance between ID1 and ID4.

But since there is no direct edge between these 2 ids, we would have to jump to ID 2 or 3 first.

Lets calculate the distance to ID 2:
Which label should we jump to within ID2 from ID1?
Since distance is defined as the absolute difference between 2 label values, and source label is '7', we should jump at a label which is closest to value to 7.
Among {1, 2, 8, 12, 15}, this would be '8' (distance = 8-7 = 1)

Lets calculate the distance to ID 3:
Which label should we jump to within ID3 from ID1?
Among {5, 10, 11}, the number closest to 7 would be '5' (distance = 7-5 = 2)

Since distance to ID2 is smaller than ID3, we should jump to ID2, label 8.

Now that we are at label 8, we can directly jump to label 9 in ID4 (distance = 9-8 = 1)

Total distance = 1 + min(1, 2) + 1 = 3

The general algorithm would be:

  • Create a hashmap between ID and a sorted list of labels who share that ID.
  • If source label is L1 and destination label is L2, assuming they belong to ids I1 and I2, perform dijkstra to find shortest distance between ids I1 and I2.
  • While performing dijkstra, compute the 'distance' between current label 'L1' and a neighbour id by choosing the label which is closest in value to L1 among the list of labels sharing that id (this list can be accessed from the hashmap).
  • For all such distances from L1 to the neighbour IDs, push all these distances (along with the chosen label) to a priority queue (same as normal dijkstra).

Finding a number closest to a value in a sorted array can be done in O(logn), where n is the size of the sorted array.

The algorithmic complexity for dijkstra would be O(M^2 * logM * logX), where X = average number of labels sharing a single ID, which can be safely assumed to be N / M.

Creating hashmap with sorted lists would take O(M*X*logX) = O(N * logN/M) complexity.

So total complexity = O((N * log(N/M)) + (M^2 * logM * log(N/M)) ~ O(log(N/M) * (N + M^2)

like image 111
Anmol Singh Jaggi Avatar answered Nov 21 '25 15:11

Anmol Singh Jaggi


Disclaimer:
As I understand your formulation, edge ID1 -> ID2 means that from a node with ID1 there is an edge to every other node with ID2. If this is not correct, simply ignore the rest. For now I will assume this to be the truth.

Observation:
Since the edge ID1->ID2 means that every node with ID1 has an edge to every other node with ID2 it doesn't matter which node with certain ID we use - they are all equivalent for the shortest path purpose.
This means that to compute the length of the shortest path from a node Label_X, ID_I to a node Label_Y, ID_J we can simply compute the Djikstra over a graph that considers every label present exactly once to find a shortest path over labels. This will take O(M * log(M^2)) operations.

Now, since we have the path of labels we can simply select a first node, with the required label, that we encountered. Since we have to read the input anyways, we can simply create an array with a first node of every ID that we encounter. The reading of the input will take ~ N + M^2 operations
The total complexity in this case would be O(N + M^2 + M * log(M^2)). In your case the total would be 50000 + 2500 + 50 * log(2500) = ~ 53100.

like image 33
Shamis Avatar answered Nov 21 '25 15:11

Shamis



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!