I'd like to focus on the flow highlighted above connecting the blue 'Thermal generation' block to the pink 'Electricity grid' block. You'll notice that the flow is 526 TWh, which is row #62 of Energy$links.
Energy$links
source target value
...
62 26 15 525.531
...
Now let's focus on the source and target values, which refer to nodes in Energy$nodes.
Energy$nodes
name
...
15 Heating and cooling - homes
16 Electricity grid
...
26 Gas reserves
27 Thermal generation
...
The source value is '26' when it actually refers to row '27' of the nodes data. The target value is '15' when it actually refers to row '16' of the nodes data. Why do the source and target values in the links data refer to row x - 1 instead of row x in the nodes data? Is there any way around this other than performing the x - 1 calculation in my head when building these Sankey diagrams?
Here's the full Energy data:
> Energy
$`nodes`
name
1 Agricultural 'waste'
2 Bio-conversion
3 Liquid
4 Losses
5 Solid
6 Gas
7 Biofuel imports
8 Biomass imports
9 Coal imports
10 Coal
11 Coal reserves
12 District heating
13 Industry
14 Heating and cooling - commercial
15 Heating and cooling - homes
16 Electricity grid
17 Over generation / exports
18 H2 conversion
19 Road transport
20 Agriculture
21 Rail transport
22 Lighting & appliances - commercial
23 Lighting & appliances - homes
24 Gas imports
25 Ngas
26 Gas reserves
27 Thermal generation
28 Geothermal
29 H2
30 Hydro
31 International shipping
32 Domestic aviation
33 International aviation
34 National navigation
35 Marine algae
36 Nuclear
37 Oil imports
38 Oil
39 Oil reserves
40 Other waste
41 Pumped heat
42 Solar PV
43 Solar Thermal
44 Solar
45 Tidal
46 UK land based bioenergy
47 Wave
48 Wind
$links
source target value
1 0 1 124.729
2 1 2 0.597
3 1 3 26.862
4 1 4 280.322
5 1 5 81.144
6 6 2 35.000
7 7 4 35.000
8 8 9 11.606
9 10 9 63.965
10 9 4 75.571
11 11 12 10.639
12 11 13 22.505
13 11 14 46.184
14 15 16 104.453
15 15 14 113.726
16 15 17 27.140
17 15 12 342.165
18 15 18 37.797
19 15 19 4.412
20 15 13 40.858
21 15 3 56.691
22 15 20 7.863
23 15 21 90.008
24 15 22 93.494
25 23 24 40.719
26 25 24 82.233
27 5 13 0.129
28 5 3 1.401
29 5 26 151.891
30 5 19 2.096
31 5 12 48.580
32 27 15 7.013
33 17 28 20.897
34 17 3 6.242
35 28 18 20.897
36 29 15 6.995
37 2 12 121.066
38 2 30 128.690
39 2 18 135.835
40 2 31 14.458
41 2 32 206.267
42 2 19 3.640
43 2 33 33.218
44 2 20 4.413
45 34 1 4.375
46 24 5 122.952
47 35 26 839.978
48 36 37 504.287
49 38 37 107.703
50 37 2 611.990
51 39 4 56.587
52 39 1 77.810
53 40 14 193.026
54 40 13 70.672
55 41 15 59.901
56 42 14 19.263
57 43 42 19.263
58 43 41 59.901
59 4 19 0.882
60 4 26 400.120
61 4 12 46.477
62 26 15 525.531 # the highlighted 'flow'
63 26 3 787.129
64 26 11 79.329
65 44 15 9.452
66 45 1 182.010
67 46 15 19.013
68 47 15 289.366
In R, the networkD3 package is the best way to build Sankey diagrams. The package can visualize networks in several ways, and its sankeyNetwork() function draws Sankey diagrams. Follow the steps below to get the basics and learn how to customize your Sankey diagram.
An incidence matrix is square or rectangular. Row and column names are node names. The item in row x and column y represents the flow from x to y. The Sankey diagram shows every flow greater than 0. Since the networkD3 library expects a connection data frame, we will first convert the dataset, and then re-use the code from above.
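The conversion described above can be sketched in base R. This is only an illustration: the matrix `flows` and the column names `IDsource` and `IDtarget` are made up for this example and do not come from the Energy data.

```r
# Hypothetical incidence matrix: rows are sources, columns are targets.
flows <- matrix(c(0, 5, 3,
                  0, 0, 2,
                  0, 0, 0),
                nrow = 3, byrow = TRUE,
                dimnames = list(c("A", "B", "C"), c("A", "B", "C")))

# Reshape to long format: one row per (source, target) cell.
links <- as.data.frame(as.table(flows), stringsAsFactors = FALSE)
names(links) <- c("source", "target", "value")

# Keep only the flows that are greater than 0.
links <- links[links$value > 0, ]

# One row per distinct node name.
nodes <- data.frame(name = unique(c(links$source, links$target)))

# match() gives 1-based positions; subtract 1 to get 0-based indices.
links$IDsource <- match(links$source, nodes$name) - 1
links$IDtarget <- match(links$target, nodes$name) - 1
```

The resulting `links` and `nodes` can then be passed to sankeyNetwork(), with "IDsource" and "IDtarget" as the Source and Target columns.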
That being said, networkD3 is not designed to facilitate that level of customization. In order to achieve that, one would have to heavily modify the underlying JavaScript that is included in the package.
Thanks cjyetman, what other packages could you recommend for the Sankey diagram?
The reason is that the data ultimately gets sent to JavaScript/D3, which uses 0-based indexing: the index of the first element of a vector or array is 0, unlike in R, where the index of the first element of a vector is 1.
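To avoid doing the x - 1 arithmetic in your head, you can apply the offset in code when reading a links table. Here is a minimal sketch, assuming a trimmed three-node stand-in for Energy. The node positions are hypothetical and do not match the full data set.

```r
# Trimmed, hypothetical stand-in for Energy: three nodes and one link.
nodes <- data.frame(name = c("Gas reserves",
                             "Thermal generation",
                             "Electricity grid"))
links <- data.frame(source = 1, target = 2, value = 525.531)

# Link indices are 0-based and R rows are 1-based, so add 1 when looking up.
labelled <- links
labelled$source_name <- nodes$name[links$source + 1]
labelled$target_name <- nodes$name[links$target + 1]
print(labelled)
```

With the names attached, each row of the links data can be read directly without translating indices.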
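As an example of easily converting R-style data:
# Links described by node names rather than by index
source <- c("A", "A", "B", "C", "D", "D", "E", "E")
target <- c("D", "E", "E", "D", "H", "I", "I", "H")
# One row per distinct node
nodes <- data.frame(name = unique(c(source, target)))
# match() returns 1-based positions; subtract 1 to get the 0-based indices D3 expects
links <- data.frame(source = match(source, nodes$name) - 1,
                    target = match(target, nodes$name) - 1,
                    value = 1)
library(networkD3)
sankeyNetwork(Links = links, Nodes = nodes, Source = "source",
              Target = "target", Value = "value", NodeID = "name")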
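If a links table already stores R-style row numbers rather than names, subtracting 1 from both index columns should be enough. Here is a sketch using hypothetical `nodes` and `links_r` objects. The range check at the end is an added safeguard for illustration only.

```r
nodes <- data.frame(name = c("A", "B", "C"))

# Hypothetical links that point at nodes by 1-based row number.
links_r <- data.frame(source = c(1, 1, 2),
                      target = c(2, 3, 3),
                      value  = c(5, 3, 2))

# Shift both index columns to 0-based.
links_js <- links_r
links_js$source <- links_r$source - 1
links_js$target <- links_r$target - 1

# Every index must now fall between 0 and nrow(nodes) - 1.
idx <- c(links_js$source, links_js$target)
stopifnot(all(idx >= 0), all(idx <= nrow(nodes) - 1))
```

A failed check usually means the offset was applied twice or not at all.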