I'm new to the R programming and I'm involved in representing graphs using R. I would like to ask about how I implement a code that can find all paths between two vertices or nodes based on an adjacency matrix. I've seen many implementations in other programming languages but most of them used queues as in (BFS) to make them work. For example this is the edge list of my graph.
[,1] [,2]
[1,] 0 1
[2,] 1 2
[3,] 1 3
[4,] 1 4
[5,] 2 5
[6,] 2 6
[7,] 5 7
[8,] 5 8
[9,] 6 9
[10,] 6 10
[11,] 8 11
[12,] 10 12
[13,] 11 13
[14,] 11 14
[15,] 11 15
[16,] 12 16
[17,] 12 17
[18,] 12 18
[19,] 13 19
[20,] 16 20
[21,] 19 21
[22,] 19 22
[23,] 20 22
[24,] 20 23
If I wanted all paths between node 0 and node 22, they should be two paths:
[[1]]
[1] 0 1 2 6 10 12 16 20 22
[[2]]
[1] 0 1 2 5 8 11 13 19 22
Thanks
I have used the following code to create a matrix (vertices x vertices) that contains the number of all paths between every two vertices.
library(igraph)
setwd("C:/Workspace")
graph <- read.graph("graph.txt", format="edgelist")
direct <- get.adjacency(graph)
indirect <- direct
max <- vcount(graph)-1
for(i in 0:max)
for(j in 0:max)
indirect[i,j] <- length(get.all.shortest.paths(graph, from=i, to=j, mode="out"))
I propose to use the igraph library for this purpose.
library(igraph)
I have put your edge list into a file called "graph.txt" which i put into "C:\workspace". Then I use the following code to read-in that file in R:
setwd("C:/Workspace")
graph <- read.graph("graph.txt", format="edgelist")
You might want to plot the graph just to be sure that everything 's alright (however, this step can be left away):
plot(graph, layout=layout.fruchterman.reingold.grid)
I create a adjacency-matrix to see all direct links between the vertices:
direct <- get.adjacency(graph)
Then I create a dummy matrix called "indirect" wich is a copy of the adjancency matrix. I just need this matrix to later fill it with the indirect values:
indirect <- direct
Finally, I iterate over the whole graph to find the number of all indirect connections between every two vertices. I put this number into the indirect-matrix that I have just created before (additionally I have created a clause putting all values on the diagonal zero). Mode "out" ensures that only outgoing paths are counted. This can also be set to "in" or "total":
max <- vcount(graph)-1
for(i in 0:max)
for(j in 0:max)
indirect[i,j] <- length(get.all.shortest.paths(graph, from=i, to=j, mode="out"))
Assuming that you have a simple directed acyclic graph (DAG), the following approach will work for counting:
(A^n)_ij
gives you the number of paths of length n
between nodes i
and j
. Therefore you need to compute A + A^2 + ... + A^n + ...
to get the total number of paths between any two nodes. It is essential that you work with a DAG, as this guarantees that for large enough n
, A^n = 0
. Then the result can be written as A . (I - A)^(-1)
where I
is the identity matrix.
EDIT:
I don't really know R so I can only give you some pseudocode or explanations.
First, let's find the set of nodes reachable from node i
. Let's define vector v
to contain only zeros except at the i
th position where it contains 1. E.g. for the 1st node you'll have
v = (1,0,0, ..., 0)
Now let v_(n+1) = sign(v_n + A . v_n)
, where the purpose of the sign()
function is to replace nonzero elements by 1 and keep zeros 0. Do this iteration until you reach the fixed point, and you'll have a vector v
with nonzero elements at the positions corresponding to the nodes reachable from node i
.
If instead of the vector v
you start with the identity matrix (of the same size as A
), you'll get the reachable nodes for each other node in one go.
Now you have the set of reachable nodes for any starting node. Similarly you can get the list of nodes from which any target node is reachable (just reverse the directed edges, i.e. transpose A
)
Next, let's traverse the graph and find all paths you need.
This recursive function should do it (pseudocode):
traverse( path-so-far, target ):
let S = the last element of path-so-far
if S == target:
output path-so-far
return
let N = the set of nodes reachable from S in one step
remove all nodes from N from which the target is not reachable
for each K in N:
traverse( append(path-so-far, K), target )
path-so-far
is the list of nodes already visited; target
is the target node.
For a given pair of start and target nodes, just do traverse( {start}, target )
.
Note that the step where we remove all nodes from which the target is not reachable is only there to speed up the traversal, and don't enter "blind alleys"
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With