Fixing the order of a Sankey flow graph in R / networkD3 package

I would like to visualize a rank change (i.e. a change in the relative order) of US states using a Sankey flow graph. I'm using the networkD3 package and came up with the following:

library(dplyr)
library(networkD3)

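# One row per state: left-hand (origin) and right-hand (destination) labels
# tibble() replaces the deprecated data_frame()
df <- tibble(origins = state.name[1:10], destinations = state.name[1:10])
lab <- c(df$origins, df$destinations)
# 20 nodes in total; links refer to them by zero-based row index
nodes <- data.frame(node = 0:19, name = lab)
links <- data.frame(source = 0:9, target = 10:19, value = rep(1, 10))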
sankeyNetwork(Links = links, Nodes = nodes, Source = 'source', 
              Target = 'target', Value = 'value', NodeID = 'name')

This snippet produces the following graph: [image: Sankey flow graph of US states]

I can now change the relative order by hand. However, I wonder whether it is possible to fix the order on the right-hand side in code, e.g. to put Alabama at rank 3, California at rank 1, and so on.

Ben asked Sep 07 '18 20:09



1 Answer

If you set iterations = 0 in sankeyNetwork(), you effectively disable the algorithm that automatically determines node placement (which is the primary purpose of the sankeyNetwork() function), and the nodes are placed in the order in which they appear in the Nodes data frame.

library(networkD3)

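states <- state.name[1:10]
ranks <- sample.int(10)  # random permutation standing in for the desired ranking

# Left-hand nodes in original order, then right-hand nodes in ranked order
nodes <- data.frame(name = c(states, states[ranks]))
# order(ranks) gives each state's row among the right-hand nodes; indices are zero-based
links <- data.frame(source = 1:10 - 1, target = order(ranks) + 10 - 1, value = 1)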

sankeyNetwork(Links = links, Nodes = nodes, Source = 'source', 
              Target = 'target', Value = 'value', NodeID = 'name',
              iterations = 0)
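
The example above uses a random ranking. To pin specific states to specific ranks, as the question asks (California at rank 1, Alabama at rank 3), one option is to write out the right-hand order explicitly and look up each state's position with match(). The following is a minimal sketch under that assumption; right_order is a hypothetical ranking made up for illustration, and the plot is drawn only if networkD3 is installed.

```r
# Sketch only: `right_order` is a hypothetical ranking chosen for illustration.
states <- state.name[1:10]

# Desired top-to-bottom order of the right-hand column
right_order <- c("California", "Georgia", "Alabama", "Florida", "Delaware",
                 "Alaska", "Connecticut", "Arizona", "Colorado", "Arkansas")
stopifnot(setequal(states, right_order))  # must be a permutation of `states`

# Left-hand nodes first, then right-hand nodes in the desired order
nodes <- data.frame(name = c(states, right_order), stringsAsFactors = FALSE)

# match() returns each state's 1-based position in `right_order`;
# adding length(states) - 1 converts it to a zero-based row index in `nodes`
links <- data.frame(
  source = seq_along(states) - 1,
  target = match(states, right_order) + length(states) - 1,
  value  = 1
)

# Build the widget only if networkD3 is available; print `p` to display it
if (requireNamespace("networkD3", quietly = TRUE)) {
  p <- networkD3::sankeyNetwork(Links = links, Nodes = nodes, Source = "source",
                                Target = "target", Value = "value",
                                NodeID = "name", iterations = 0)
}
```

With iterations = 0, the right-hand column should follow right_order from top to bottom, so changing the ranking only requires editing that one vector.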

CJ Yetman answered Sep 22 '22 10:09