Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Minimum Cost Flow - network optimization in R

I am trying to implement a "Minimum Cost Network Flow" transportation problem solution in R.

I understand that this could be implemented from scratch using something like lpSolve. However, I see that there is a convenient igraph implementation for "Maximum Flow". Such a pre-existing solution would be a lot more convenient, but I can't find an equivalent function for Minimum Cost.

Is there an igraph function that calculates Minimum Cost Network Flow solutions, or is there a way to apply the igraph::max_flow function to a Minimum Cost problem?

igraph network example:

library(tidyverse)
library(igraph)

edgelist <- data.frame(
  from = c(1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 5, 5, 6, 6, 7, 8),
  to = c(2, 3, 4, 5, 6, 4, 5, 6, 7, 8, 7, 8, 7, 8, 9, 9),
  capacity = c(20, 30, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99),
  cost = c(0, 0, 1, 2, 3, 4, 3, 2, 3, 2, 3, 4, 5, 6, 0, 0))

g <- graph_from_edgelist(as.matrix(edgelist[,c('from','to')]))

E(g)$capacity <- edgelist$capacity
E(g)$cost <- edgelist$cost

plot(g, edge.label = E(g)$capacity)
plot(g, edge.label = E(g)$cost)

enter image description here enter image description here

This is a network with directional edges, a "source node" (1), and a "sink node" (9). Each edge has a "capacity" (here often put as 99 for unlimited) and a "cost" (the cost of one unit flowing through this edge). I want to find the integer vector of flows (x, length = 9) that minimises the cost while transmitting a pre-defined flow through the network (let's say 50 units, from node 1 into node 9).

Disclaimer: this post asked a similar question, but did not result in a satisfying answer and is rather dated (2012).

like image 322
JanLauGe Avatar asked Apr 25 '17 16:04

JanLauGe


People also ask

How do you calculate minimum flow network?

Turn the feasible flow into a minimum flow by solving a max flow problem. You need to find the maximum flow on the graph that has capacities equal to flow(e) - lower-bound(e), where flow(e) means flow from the feasible flow. This maximum flow subtracted from the feasible flow will be a minimum flow.

Which algorithm solves the minimal cost network flow problems?

The algorithm that the network solver uses for solving MCF is a variant of the primal network simplex algorithm (Ahuja, Magnanti, and Orlin 1993).

What is network flow optimization?

In combinatorial optimization, network flow problems are a class of computational problems in which the input is a flow network (a graph with numerical capacities on its edges), and the goal is to construct a flow, numerical values on each edge that respect the capacity constraints and that have incoming flow equal to ...


2 Answers

In case of interest, here is how I ended up solving this problem. I used an edge dataframe with to nodes, from nodes, cost property and capacity property for each edge to create a constraint matrix. Subsequently, I fed this into a linear optimisation using the lpSolve package. Step-by-step below.


Start with the edgelist dataframe from my example above

library(magrittr)

# Add edge ID
edgelist$ID <- seq(1, nrow(edgelist))

glimpse(edgelist)

Looks like this:

Observations: 16
Variables: 4
$ from     <dbl> 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 5, 5, 6, 6, 7, 8
$ to       <dbl> 2, 3, 4, 5, 6, 4, 5, 6, 7, 8, 7, 8, 7, 8, 9, 9
$ capacity <dbl> 20, 30, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99
$ cost     <dbl> 0, 0, 1, 2, 3, 4, 3, 2, 3, 2, 3, 4, 5, 6, 0, 0
$ ID       <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16

Create constraints matrix

createConstraintsMatrix <- function(edges, total_flow) {

  # Edge IDs to be used as names
  names_edges <- edges$ID
  # Number of edges
  numberof_edges <- length(names_edges)

  # Node IDs to be used as names
  names_nodes <- c(edges$from, edges$to) %>% unique
  # Number of nodes
  numberof_nodes <- length(names_nodes)

  # Build constraints matrix
  constraints <- list(
    lhs = NA,
    dir = NA,
    rhs = NA)

  #' Build capacity constraints ------------------------------------------------
  #' Flow through each edge should not be larger than capacity.
  #' We create one constraint for each edge. All coefficients zero
  #' except the ones of the edge in question as one, with a constraint
  #' that the result is smaller than or equal to capacity of that edge.

  # Flow through individual edges
  constraints$lhs <- edges$ID %>%
    length %>%
    diag %>%
    set_colnames(edges$ID) %>%
    set_rownames(edges$ID)
  # should be smaller than or equal to
  constraints$dir <- rep('<=', times = nrow(edges))
  # than capacity
  constraints$rhs <- edges$capacity


  #' Build node flow constraints -----------------------------------------------
  #' For each node, find all edges that go to that node
  #' and all edges that go from that node. The sum of all inputs
  #' and all outputs should be zero. So we set inbound edge coefficients as 1
  #' and outbound coefficients as -1. In any viable solution the result should
  #' be equal to zero.

  nodeflow <- matrix(0,
                     nrow = numberof_nodes,
                     ncol = numberof_edges,
                     dimnames = list(names_nodes, names_edges))

  for (i in names_nodes) {

    # input arcs
    edges_in <- edges %>%
      filter(to == i) %>%
      select(ID) %>%
      unlist
    # output arcs
    edges_out <- edges %>%
      filter(from == i) %>%
      select(ID) %>%
      unlist

    # set input coefficients to 1
    nodeflow[
      rownames(nodeflow) == i,
      colnames(nodeflow) %in% edges_in] <- 1

    # set output coefficients to -1
    nodeflow[
      rownames(nodeflow) == i,
      colnames(nodeflow) %in% edges_out] <- -1
  }

  # But exclude source and target edges
  # as the zero-sum flow constraint does not apply to these!
  # Source node is assumed to be the one with the minimum ID number
  # Sink node is assumed to be the one with the maximum ID number
  sourcenode_id <- min(edges$from)
  targetnode_id <- max(edges$to)
  # Keep node flow values for separate step below
  nodeflow_source <- nodeflow[rownames(nodeflow) == sourcenode_id,]
  nodeflow_target <- nodeflow[rownames(nodeflow) == targetnode_id,]
  # Exclude them from node flow here
  nodeflow <- nodeflow[!rownames(nodeflow) %in% c(sourcenode_id, targetnode_id),]

  # Add nodeflow to the constraints list
  constraints$lhs <- rbind(constraints$lhs, nodeflow)
  constraints$dir <- c(constraints$dir, rep('==', times = nrow(nodeflow)))
  constraints$rhs <- c(constraints$rhs, rep(0, times = nrow(nodeflow)))


  #' Build initialisation constraints ------------------------------------------
  #' For the source and the target node, we want all outbound nodes and
  #' all inbound nodes to be equal to the sum of flow through the network
  #' respectively

  # Add initialisation to the constraints list
  constraints$lhs <- rbind(constraints$lhs,
                           source = nodeflow_source,
                           target = nodeflow_target)
  constraints$dir <- c(constraints$dir, rep('==', times = 2))
  # Flow should be negative for source, and positive for target
  constraints$rhs <- c(constraints$rhs, total_flow * -1, total_flow)

  return(constraints)
}

constraintsMatrix <- createConstraintsMatrix(edges, 30)

Should result in something like this

$lhs
1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16
1       1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
2       0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0
3       0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0
4       0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0
5       0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0
6       0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0
7       0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0
8       0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0
9       0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0
10      0  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0
11      0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0
12      0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0
13      0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0
14      0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0
15      0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0
16      0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1
2       1  0 -1 -1 -1  0  0  0  0  0  0  0  0  0  0  0
3       0  1  0  0  0 -1 -1 -1  0  0  0  0  0  0  0  0
4       0  0  1  0  0  1  0  0 -1 -1  0  0  0  0  0  0
5       0  0  0  1  0  0  1  0  0  0 -1 -1  0  0  0  0
6       0  0  0  0  1  0  0  1  0  0  0  0 -1 -1  0  0
7       0  0  0  0  0  0  0  0  1  0  1  0  1  0 -1  0
8       0  0  0  0  0  0  0  0  0  1  0  1  0  1  0 -1
source -1 -1  0  0  0  0  0  0  0  0  0  0  0  0  0  0
target  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  1

$dir
[1] "<=" "<=" "<=" "<=" "<=" "<=" "<=" "<=" "<=" "<=" "<=" "<=" "<=" "<="
[15] "<=" "<=" "==" "==" "==" "==" "==" "==" "==" "==" "=="

$rhs
[1]  20  30  99  99  99  99  99  99  99  99  99  99  99  99  99  99   0
[18]   0   0   0   0   0   0 -30  30

Feed constraints to lpSolve for the ideal solution

library(lpSolve)

# Run lpSolve to find best solution
solution <- lp(
  direction = 'min',
  objective.in = edgelist$cost,
  const.mat = constraintsMatrix$lhs,
  const.dir = constraintsMatrix$dir,
  const.rhs = constraintsMatrix$rhs)

# Print vector of flow by edge
solution$solution

# Include solution in edge dataframe
edgelist$flow <- solution$solution

Now we can convert our edges to a graph object and plot the solution

library(igraph)

g <- edgelist %>%
  # igraph needs "from" and "to" fields in the first two colums
  select(from, to, ID, capacity, cost, flow) %>%
  # Make into graph object
  graph_from_data_frame()

# Get some colours in to visualise cost
E(g)$color[E(g)$cost == 0] <- 'royalblue'
E(g)$color[E(g)$cost == 1] <- 'yellowgreen'
E(g)$color[E(g)$cost == 2] <- 'gold'
E(g)$color[E(g)$cost >= 3] <- 'firebrick'

# Flow as edge size,
# cost as colour
plot(g, edge.width = E(g)$flow)

graph with flow of optimal solution

Hope it's interesting / useful :)

like image 112
JanLauGe Avatar answered Oct 09 '22 20:10

JanLauGe


I was looking for a function but didn't success. The original function calls another: res <- .Call("R_igraph_maxflow", graph, source - 1, target - 1, capacity, PACKAGE = "igraph")

And I don't know how to deal with it.

For the moment I inverted the cost path values in order to use the same function in oposite direction:

E2 <- E # create another table
E2[, 3] <- max(E2[, 3]) + 1 - E2[, 3] # invert values

E2
     from to capacity
[1,]    1  3        8
[2,]    3  4       10
[3,]    4  2        9
[4,]    1  5       10
[5,]    5  6        9
[6,]    6  2        1

g2 <- graph_from_data_frame(as.data.frame(E2)) # create 2nd graph

# Get maximum flow
m1 <- max_flow(g1, source=V(g1)["1"], target=V(g1)["2"])
m2 <- max_flow(g2, source=V(g2)["1"], target=V(g2)["2"])

m1$partition2 # Route on maximal cost
+ 4/6 vertices, named:
[1] 4 5 6 2

m2$partition2 # Route on minimal cost
+ 3/6 vertices, named:
[1] 3 4 2

I draw in paper the graph and my code agree with manual solution This method should be tested with real know values, as I mentioned in the comment

like image 37
gonzalez.ivan90 Avatar answered Oct 09 '22 19:10

gonzalez.ivan90