
Sankey Diagrams in R?

I am trying to visualize my data flow with a Sankey Diagram in R.

I found this blog post linking to an R script that produces a Sankey Diagram; unfortunately, it's quite raw and somewhat limited (see below for sample code and data).

Does anyone know of other scripts—or maybe even a package—that is more developed? My end goal is to visualize both data flow and percentages by relative size of diagram components, like in these examples of Sankey Diagrams.

I posted a somewhat similar question on the r-help list, but after two weeks without any responses I'm trying my luck here on stackoverflow.

Thanks, Eric

PS. I'm aware of the Parallel Sets Plot, but that is not what I'm looking for.

    # thanks to https://tonybreyal.wordpress.com/2011/11/24/source_https-sourcing-an-r-script-from-github/
    sourc.https <- function(url, ...) {
      # install and load the RCurl package
      if (match('RCurl', nomatch=0, installed.packages()[,1])==0) {
        install.packages(c("RCurl"), dependencies = TRUE)
        require(RCurl)
      } else require(RCurl)

      # parse and evaluate each .R script
      sapply(c(url, ...), function(u) {
        eval(parse(text = getURL(u, followlocation = TRUE,
                                 cainfo = system.file("CurlSSL", "cacert.pem",
                                                      package = "RCurl"))),
             envir = .GlobalEnv)
      })
    }

    # from https://gist.github.com/1423501
    sourc.https("https://raw.github.com/gist/1423501/55b3c6f11e4918cb6264492528b1ad01c429e581/Sankey.R")

    # My example (there is another example inside Sankey.R):
    inputs = c(6, 144)
    losses = c(6,47,14,7, 7, 35, 34)
    unit = "n ="

    labels = c("Transfers",
               "Referrals\n",
               "Unable to Engage",
               "Consultation only",
               "Did not complete the intake",
               "Did not engage in Treatment",
               "Discontinued Mid-Treatment",
               "Completed Treatment",
               "Active in \nTreatment")

    SankeyR(inputs,losses,unit,labels)

    # Clean up my mess
    rm("inputs", "labels", "losses", "SankeyR", "sourc.https", "unit")

[Figure: Sankey diagram produced with the code above]

Eric Fail, asked Apr 01 '12 21:04



2 Answers

This plot can be created with the networkD3 package, which produces interactive Sankey diagrams. Here is an example; I also added a screenshot so you have an idea of what it looks like.

    # Load package
    library(networkD3)

    # Load energy projection data
    URL <- paste0(
            "https://cdn.rawgit.com/christophergandrud/networkD3/",
            "master/JSONdata/energy.json")
    Energy <- jsonlite::fromJSON(URL)

    # Plot
    sankeyNetwork(Links = Energy$links, Nodes = Energy$nodes, Source = "source",
                  Target = "target", Value = "value", NodeID = "name",
                  units = "TWh", fontSize = 12, nodeWidth = 30)

[Screenshot: interactive Sankey diagram produced by the code above]
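To apply the same function to the data from the question, the links and nodes tables can be built by hand. The following is a minimal sketch and not part of the original answer. It assumes a hypothetical hub node named "All clients" that joins the two inputs to the seven outcomes, drops the line breaks from the question's labels, and appends percentages to the node labels as one possible way to show relative size. Note that sankeyNetwork expects zero-based node indices in the source and target columns.

```r
# Counts taken from the question; names are the question's labels without "\n"
inputs <- c("Transfers" = 6, "Referrals" = 144)
losses <- c("Unable to Engage" = 6,
            "Consultation only" = 47,
            "Did not complete the intake" = 14,
            "Did not engage in Treatment" = 7,
            "Discontinued Mid-Treatment" = 7,
            "Completed Treatment" = 35,
            "Active in Treatment" = 34)

hub   <- "All clients"   # hypothetical hub node, not in the original data
total <- sum(inputs)     # 150, which also equals sum(losses)

node_names  <- c(names(inputs), hub, names(losses))
node_values <- unname(c(inputs, total, losses))

# Node labels carry the share of the total, e.g. "Transfers (4%)"
nodes <- data.frame(
  name = sprintf("%s (%.0f%%)", node_names, 100 * node_values / total),
  stringsAsFactors = FALSE
)

# networkD3 wants zero-based positions into the nodes table
index <- function(x) match(x, node_names) - 1L

links <- data.frame(
  source = index(c(names(inputs), rep(hub, length(losses)))),
  target = index(c(rep(hub, length(inputs)), names(losses))),
  value  = unname(c(inputs, losses))
)

# Build the widget only if networkD3 is installed
if (requireNamespace("networkD3", quietly = TRUE)) {
  sankey <- networkD3::sankeyNetwork(
    Links = links, Nodes = nodes,
    Source = "source", Target = "target", Value = "value",
    NodeID = "name", units = "clients", fontSize = 12, nodeWidth = 30
  )
  # In an interactive session, print(sankey) opens the diagram in the viewer
}
```

Because the inflow and outflow of the hub node both sum to 150, the diagram balances, and each band's width is proportional to its count.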

Jonas Tundo, answered Oct 09 '22 07:10


I have created a package (riverplot) that has a slightly different, but overlapping functionality compared to the Sankey function, and can produce plots like this one:

[Image: example plot produced with riverplot]
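As a rough illustration of how the question's data might be passed to riverplot, here is a minimal sketch that is not part of the original answer. It assumes the package's makeRiver function accepts a nodes data frame with ID, x and labels columns and an edges data frame with N1, N2 and Value columns, as described in the package documentation. The short node IDs and the "All clients" hub node are hypothetical names introduced here.

```r
# Counts taken from the question
inputs <- c(6, 144)
losses <- c(6, 47, 14, 7, 7, 35, 34)

in_ids  <- c("T", "R")                      # hypothetical short IDs
out_ids <- paste0("O", seq_along(losses))   # O1 ... O7

# One row per node; x is the horizontal position (column) in the plot
nodes <- data.frame(
  ID = c(in_ids, "All", out_ids),
  x  = c(1, 1, 2, rep(3, length(losses))),
  labels = c("Transfers", "Referrals", "All clients",
             "Unable to Engage", "Consultation only",
             "Did not complete the intake", "Did not engage in Treatment",
             "Discontinued Mid-Treatment", "Completed Treatment",
             "Active in Treatment"),
  stringsAsFactors = FALSE
)

# One row per flow: inputs feed the hub, the hub feeds each outcome
edges <- data.frame(
  N1    = c(in_ids, rep("All", length(losses))),
  N2    = c(rep("All", length(inputs)), out_ids),
  Value = c(inputs, losses),
  stringsAsFactors = FALSE
)

# Build and draw the diagram only if riverplot is installed
if (requireNamespace("riverplot", quietly = TRUE)) {
  river <- riverplot::makeRiver(nodes, edges)
  if (interactive()) plot(river)
}
```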

January, answered Oct 09 '22 08:10