Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to set SetGlobalSpanCostCoefficient and the capacity parameter in AddDimension properly?

I'm using OR-Tool to solve a VRP problem. I have experimented a bit with the exemple problem in the doc and managed to write a functioning program, but, I do not understand the purpose of the SetGlobalSpanCostCoefficient and how to set it properly. According to this site, it is the coefficient between the global span cost and the difference between the max and min dimension value in all routes. So, is this global cost the sum of all the routes's costs and is it calculated from the 'capacity' parameter of the Dimension and used like a maximum capacity limiter.

The problem in my code is that somme vihecles are not used unless I tweak the capacity (maximum route distance) in the AddDimension function and that globalSpanCostcoefficient manually. I have a 1000 nodes : With a distribution of distances (in meters) that looks likes this :


    # Add Distance constraint.
    distance_dimension_name = 'Distance'
    routing.AddDimension(
        transit_callback_index,
        0,  # no slack
        25000,  # vehicle maximum travel distance
        True,  # start cumul to zero
        distance_dimension_name)
    distance_dimension = routing.GetDimensionOrDie(distance_dimension_name)
    distance_dimension.SetGlobalSpanCostCoefficient(100)

Here, I get a max route distance of 2610m with 5 and 6 vehicles and it clusters in two routes. I have tried to add a counting dimension like it is described here but it becomes too slow even for a 100 nodes and the results are the same with 5 vehicles.

like image 662
Jugurtha Avatar asked Oct 07 '19 07:10

Jugurtha


2 Answers

The documentation is so cryptic it's very hard to actually understand what's going on. There is a lot of jargon but it's never defined clearly (formally). But here is my guess based on what I imagine it does (based on pure guess and experimentations).

First I think this tool uses linear programming (solver). Which may help in understanding what's going on.

From what I understand I think this function tweaks (configure) the objective function (the function the linear program solver is trying to minimise). In problems like this one, usually you want to minimise the total routes length (sum of all vehicles routes lengths).

If you only try to minimise this, then without any further constraint, the best solution to the problem is a single tour (one single vehicle visit each node once).

One typical constraint that is added to such problems is to limit the maximum route length (this constraint make it impossible for a single vehicle to visit all nodes in a single tour). I think then, you have several solutions, for example: 1) increase the number of vehicles, 2) allow for vehicle(s) to refuel (energy constrained), …

But I think in that case the objective function is more complex that just minimising the sum of all vehicles routes lengths, I think they also minimise the ratio longest route/shorter route.

Since its a two dimensional problem there are several optimal solutions. Each solution belonging to a Pareto frontier of the 2D solution space. Each point in the solution space simply being one solution, ie., a (sum of routes lengths, longest route/shorter route ratio) pair. Here is what it looks like (ref):

Pareto Frontier image from wikipedia

I think this tool doesn't look for all these solutions (because solving multi-criterium problems is quite hard, even more so when finding a single solution-point is already hard). Instead I think they try to find the closest solution to a given affine function, something like this:

Pareto Frontier image from wikipedia with

The slope of the function says "how much you value one criteria over the other". In that case since we want to minimise both the maximum route length and the total route length I guess this is what this configuration function does. It sets the slope.

On the other hand, giving the description from the documentation:

Sets a cost proportional to the global dimension span, that is the difference between the largest value of route end cumul variables and the smallest value of route start cumul variables. In other words: global_span_cost = coefficient * (Max(dimension end value) - Min(dimension start value)).

It's rather unclear that this matches what I described before. Because from what I understand global_span_cost is not the sum of routes lengths but, as they say, it's the "the global span of the routes, which in this example is the maximum of the distances of the routes". So I guess it's the length of the longest route amongst all vehicles routes. I can't figure out what "end value" and "start value" are. So again, all this is based on assumptions …

So maybe, the slope coefficient is fixed (to some constant, 1?) and this coefficient instead influences the solution point mappings on one of the two axis (its coordinates in the 2D solution space). For example, if you have two solutions:

  • a) max route length = 10, sum of routes lengths = 63
  • b) max route length = 8, sum of routes lengths = 90

(Note that both are impossible to compare, each one is better than the other on one criteria, they are both "optimal").

They might normalise each dimension:

  • a) max route length = 20 / 20, sum of routes lengths = 63 / 90
  • b) max route length = 16 / 20, sum of routes lengths = 90 / 90

Which gives:

  • a) max route length = 1, sum of routes lengths = .7
  • b) max route length = .8, sum of routes lengths = 1

Using these two solutions, if the affine function used to distinguish between incomparable solutions is y = x, solution a) is slightly better than solution b). Since x = y both criterium have the same importance in the selection, so solution 1 = 1.7 and solution 2 = 1.8 (a simple sum over both dimensions "sum." and "max.").

By changing the slope, we basically give the first criteria more importance over the second, which is equivalent to multiplying the first criteria by a given factor, say, for example, 3:

  • a) max route length = 3, sum of routes lengths = .7
  • b) max route length = 2.4, sum of routes lengths = 1

Gives solution a = 3.7 and solution b = 3.4, solution b) is now better than solution a).

Again, I'm only writing down what I'm guessing since none of this seems to be documented (or maybe it's just hard to find?). Also I still have no idea how to choose that coefficient according to your preferences/needs.

like image 135
cglacet Avatar answered Oct 06 '22 13:10

cglacet


The span cost is meant to measure idle time.

See: the SetGlobalSpanCostCoefficient doc entry

It differs from the dimension arc cost as (1) it incorporates slacks at nodes, and (2) it does not count initial waiting time at the depot.

In your example, you both forbid slacks, and force the first cumul to be 0. Thus the SpanCost is useless.

like image 41
Laurent Perron Avatar answered Oct 06 '22 11:10

Laurent Perron