Route problem in a graph: minimize average edge cost instead of total cost

I have a weighted graph with no negative weights, and I would like to find a path from one node to another that minimizes the cost of each single step, on average. I don't need to minimize the total cost of the trip (as e.g. Dijkstra does) but the average cost per step. However, I have a constraint: K, the maximum number of nodes allowed in the path.

So, for example, to go from A to J, Dijkstra might find this path (edge weights in parentheses):

A (4) D (6) J -> total cost: 10

and the algorithm I need, setting K = 10, would find something like

A (1) B (2) C (2) D (1) E (3) F (2) G (1) H (3) J -> total cost: 15
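(In other words, the total cost rises from 10 to 15, but the average cost per step drops from 10/2 = 5 to 15/8 ≈ 1.9.)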

Is there any well-known algorithm for this problem?

Thanks in advance.

Eugenio

Edit, in reply to templatetypedef's answer. Some questions:

1) The fact that the path may take a cycle multiple times to drive down the average is a problem for me: maybe I should have mentioned it, but I don't want to visit the same node more than once.

2) Is it possible to exploit the fact that I don't have negative weights?

3) When you said O(kE), did you mean the whole algorithm or just the additional part?

Let's take this simple implementation in C, where n is the number of nodes, e the number of edges, d an array with the distances, p an array with the predecessors, and edges an array of structs (u, v, w) storing the edges of the graph:

for (i = 0; i < n; ++i)
    d[i] = INFINITY;

d[s] = 0;

for (i = 0; i < n - 1; ++i)
    for (j = 0; j < e; ++j)
        if (d[edges[j].u] + edges[j].w < d[edges[j].v]) {
            d[edges[j].v] = d[edges[j].u] + edges[j].w;
            p[edges[j].v] = edges[j].u;
        }

I'm not sure how I should modify the code according to your answer; to take the average instead of the total cost into account, would this be enough?

for (i = 0; i < n; ++i)
    d[i] = INFINITY;

d[s] = 0;

for (i = 0; i < n - 1; ++i) {
    steps = 0;
    for (j = 0; j < e; ++j) {
        if ((d[edges[j].u] + edges[j].w) / (steps + 1) < d[edges[j].v] / steps) {
            d[edges[j].v] = d[edges[j].u] + edges[j].w;
            p[edges[j].v] = edges[j].u;
            steps++;
        }
    }
}

But anyway, I don't know how to take the K limit into account at the same time... Thanks again in advance for your help.

Edit: Since I can afford some errors, I'm thinking about this naive solution:

  • precompute all the shortest paths and store them in A
  • precompute all the shortest paths on a modified graph, where I remove the edges above a certain weight, and store them in B

When I need a path, I look in A; e.g. from x to y the path is x->z->y. Then for each step I look in B: for x->z I check whether there is a connection in B; if not, I keep x->z, otherwise I replace the step x->z with the subpath provided by B, which could be something like x->j->h->z; then I do the same for z->y. Each time I will also check that I'm not adding a cyclic path.

Maybe I will get some weird paths, but it could work in most cases. If I extend the solution by trying different "cut thresholds", maybe I can also come close to respecting the K constraint.
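To make the stitching step concrete, here is a rough C sketch of the idea (my code, not tested, and all the names are made up): nextA[x][y] and nextB[x][y] are assumed to be next-hop tables precomputed with something like Floyd-Warshall on the full graph and on the reduced graph, with -1 meaning "no path", and a simple visited[] array does the cycle check:

#define MAXN 256
#define NONE (-1)

/* Assumed to be filled beforehand (e.g. by Floyd-Warshall): next hop on
   the shortest path from x to y in the full graph (A) and in the graph
   with the heavy edges removed (B); NONE if there is no path. */
int nextA[MAXN][MAXN];
int nextB[MAXN][MAXN];

int visited[MAXN];      /* cycle check */
int path[MAXN * MAXN];  /* stitched path, as a list of nodes */
int path_len;

void emit(int node)
{
    path[path_len++] = node;
    visited[node] = 1;
}

/* Try to replace the single A-step x -> z by B's subpath x -> ... -> z.
   Fall back to the direct step if B has no path or the subpath would
   revisit an already used node. */
void expand_step(int x, int z)
{
    int u, ok = 1;

    if (nextB[x][z] != NONE) {
        for (u = nextB[x][z]; u != z; u = nextB[u][z])
            if (visited[u]) { ok = 0; break; }
        if (ok) {
            for (u = nextB[x][z]; u != z; u = nextB[u][z])
                emit(u);
            emit(z);
            return;
        }
    }
    emit(z);   /* keep the original step x -> z */
}

/* Build the stitched path from s to t; returns 0 if t is unreachable in A. */
int stitched_path(int s, int t)
{
    int x, z;

    for (x = 0; x < MAXN; ++x)
        visited[x] = 0;
    path_len = 0;
    emit(s);

    for (x = s; x != t; x = z) {
        z = nextA[x][t];
        if (z == NONE)
            return 0;
        expand_step(x, z);
    }
    return 1;
}

As said above, this can still produce weird paths, e.g. when an early detour consumes a node that the A path needs later; the sketch just accepts that.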





2 Answers

I believe that you can solve this using a modified version of the Bellman-Ford algorithm.

Bellman-Ford is based on the following dynamic programming recurrence that tries to find the shortest path from some start node s to each other node that's of length no greater than m for some m. As a base case, when you consider paths of length zero, the only reachable node is s and the initial values are

BF(s, t, 0) = infinity   (for every node t other than s)
BF(s, s, 0) = 0

Then, if we know the values for a path of length m, we can find it for paths of length m + 1 by noting that the old path may still be valid, or we want to extend some path by length one:

BF(s, t, m + 1) = min {
                     BF(s, t, m),
                     BF(s, u, m) + d(u, t) for any node u connected to t
                   }

The algorithm as a whole works by noting that any shortest path must have length no greater than n and then using the above recurrence and dynamic programming to compute the value of BF(s, t, n) for all t. Its overall runtime is O(EV), since there are E edges to consider at each step and V total vertices.

Let's see how we can change this algorithm to solve your problem. First, to limit this to paths of length k, we can just cut off the Bellman-Ford iteration after finding all shortest paths of length up to k.

Finding the path with the lowest average cost is a bit trickier. At each point, we'll track two quantities: the length of the shortest path reaching a node t and the average length of that path. When considering new paths that can reach t, our options are to either keep the earlier path we found (whose cost is given by the total cost so far divided by the number of edges in it) or to extend some other path by one step. The new cost of that path is then given by the total cost from before plus the edge length, divided by the number of edges in the old path plus one. If we take the cheapest of these and record both its cost and number of edges, at the end we will have computed the path with the lowest average cost of length no greater than k in time O(kE).

As an initialization, we will say that the path from the start node to itself has length 0 and average cost 0 (the average cost doesn't matter, since whenever we multiply it by the number of edges we get 0). We will also say that every other node is at distance infinity, by saying that the average cost of an edge is infinity and that the number of edges is one. That way, if we ever try computing the cost of a path formed by extending such a path, it will appear to have average cost infinity and won't be chosen.

Mathematically, the solution looks like this. At each point we store the average edge cost and the total number of edges at each node:

BF(s, t, 0).edges = 1          (for t ≠ s)
BF(s, t, 0).cost  = infinity   (for t ≠ s)

BF(s, s, 0).edges = 0
BF(s, s, 0).cost  = 0

BF(s, t, m + 1).cost = min {
    BF(s, t, m).cost,
    (BF(s, u, m).cost * BF(s, u, m).edges + d(u, t)) / (BF(s, u, m).edges + 1)
        for any node u connected to t
}

BF(s, t, m + 1).edges = {
    BF(s, t, m).edges        if you chose the first option above
    BF(s, u, m).edges + 1    otherwise, where u is the node chosen above
}

Note that this may not find a simple path of length k, since minimizing the average cost might require you to take a cycle with low (positive or negative) cost multiple times to drive down the average. For example, if a graph has a cost-zero loop, you should just keep taking it as many times as you can.

EDIT: In response to your new questions, this approach won't work if you don't want to duplicate nodes on a path. As @comestibles has pointed out, this version of the problem is NP-hard, so unless P = NP you shouldn't expect to find any good polynomial-time algorithm for this problem.

As for the runtime, the algorithm I've described above runs in total time O(kE). This is because each iteration of computing the recurrence takes O(E) time and there are a total of k iterations.

Finally, let's look at your proposed code. I've reprinted it here:

for (i = 0; i < n - 1; ++i) {
    steps = 0;
    for (j = 0; j < e; ++j) {
        if ((d[edges[j].u] + edges[j].w) / (steps + 1) < d[edges[j].v] / steps) {
            d[edges[j].v] = d[edges[j].u] + edges[j].w;
            p[edges[j].v] = edges[j].u;
            steps++;
        }
    }
}

Your first question was how to take k into account. This can be done easily by rewriting the outer loop to count up to k, not n - 1. That gives us this code:

for (i = 0; i < k; ++i) {
    steps = 0;
    for (j = 0; j < e; ++j) {
        if ((d[edges[j].u] + edges[j].w) / (steps + 1) < d[edges[j].v] / steps) {
            d[edges[j].v] = d[edges[j].u] + edges[j].w;
            p[edges[j].v] = edges[j].u;
            steps++;
        }
    }
}

One problem that I'm noticing is that the modified Bellman-Ford algorithm needs each candidate best path to store its number of edges independently, since each node's optimal path might be reached by a different number of edges. To fix this, I would suggest having the d array store two values per node: the number of edges required to reach the node and the average edge cost along that path. You would then update your code by replacing the steps variable in these equations with the cached path lengths.
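Concretely, a minimal sketch of that change might look like this (the struct and array names, and the choice to run K - 1 relaxation rounds because a path with at most K nodes has at most K - 1 edges, are assumptions of this sketch, not something taken from your code):

#include <float.h>

#define MAXN 1000
#define MAXE 10000

struct edge  { int u, v; double w; };
struct state { double avg; int edges; int pred; };

struct edge  edges[MAXE];   /* same edge list as in your code            */
struct state cur[MAXN];     /* best path known after the previous round  */
struct state nxt[MAXN];     /* best path known after the current round   */

void min_avg_path(int n, int e, int s, int K)
{
    int i, j, m;

    /* Initialization from the recurrence: unreached nodes get average
       cost infinity over 1 edge, the start node cost 0 over 0 edges. */
    for (i = 0; i < n; ++i) {
        cur[i].avg = DBL_MAX;
        cur[i].edges = 1;
        cur[i].pred = -1;
    }
    cur[s].avg = 0.0;
    cur[s].edges = 0;

    /* A path with at most K nodes has at most K - 1 edges. Computing
       nxt[] from cur[] means round m + 1 only ever extends paths of
       length at most m, matching BF(s, t, m + 1) above. */
    for (m = 0; m < K - 1; ++m) {
        for (i = 0; i < n; ++i)
            nxt[i] = cur[i];                      /* option 1: keep old path */

        for (j = 0; j < e; ++j) {
            int u = edges[j].u, v = edges[j].v;
            double cand;

            if (cur[u].avg == DBL_MAX)            /* u not reached yet */
                continue;

            /* option 2: extend the best path to u by the edge (u, v);
               old total cost = avg * edges, re-averaged over one more edge */
            cand = (cur[u].avg * cur[u].edges + edges[j].w)
                   / (cur[u].edges + 1);
            if (cand < nxt[v].avg) {
                nxt[v].avg   = cand;
                nxt[v].edges = cur[u].edges + 1;
                nxt[v].pred  = u;                 /* mirrors your p[] array */
            }
        }

        for (i = 0; i < n; ++i)
            cur[i] = nxt[i];
    }

    /* cur[t].avg now holds the smallest average edge cost found for any
       path from s to t that uses at most K nodes. */
}

Using two arrays rather than updating in place keeps each round honest about the edge count: otherwise a node updated early in a round could be re-extended in the same round and slip past the K limit.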

Hope this helps!


templatetypedef


For the new version of your problem, there's a reduction from Hamiltonian path (making your problem intractable). Take an instance of Hamiltonian path (i.e., a graph whose edges are assumed to have unit weight), add source and sink vertices, and add edges of weight 2 from the source to every original vertex and from every original vertex to the sink. Set K = |V| + 2 and request a path from source to sink. A path of the form source -> Hamiltonian path -> sink uses |V| + 1 edges of total weight 2 + (|V| - 1) + 2 = |V| + 3, and any path through fewer original vertices has a strictly larger mean, so there exists a Hamiltonian path if and only if the optimal mean edge cost is (|V| + 3)/(|V| + 1).

Care to tell us why you want these paths so that we can advise you of a reasonable approximation strategy?


comestibles