I've made a Sankey diagram in R with riverplot (v0.5). The output looks OK when viewed small in RStudio, but when it is exported or zoomed in, the colours have dark outlines or gridlines.
I think it may be because the outlines of the shapes are not matching the transparency I want to use for the fill?
I possibly need to find a way to get rid of outlines altogether (rather than make them semi-transparent), as I think they're also the reason why flows with a value of zero still show up as thin lines.
My code is here:
#loading packages
library(readr)
library("riverplot", lib.loc="C:/Program Files/R/R-3.3.2/library")
library(RColorBrewer)
#loading data
Cambs_flows <- read_csv("~/RProjects/Cambs_flows4.csv")
#defining the edges
edges = rep(Cambs_flows, col.names = c("N1","N2","Value"))
edges <- data.frame(edges)
edges$ID <- 1:25
#defining the nodes
nodes <- data.frame(ID = c("Cambridge","S Cambs","Rest of E","Rest of UK","Abroad","to Cambridge","to S Cambs","to Rest of E","to Rest of UK","to Abroad"))
nodes$x = c(1,1,1,1,1,2,2,2,2,2)
nodes$y = c(1,2,3,4,5,1,2,3,4,5)
#picking colours
palette = paste0(brewer.pal(5, "Set1"), "90")
#plot styles
styles = lapply(nodes$y, function(n) {
list(col = palette[n], lty = 0, textcol = "black")
})
#matching nodes to names
names(styles) = nodes$ID
#defining the river
r <- makeRiver( nodes, edges,
node_labels = c("Cambridge","S Cambs","Rest of E","Rest of UK","Abroad","to Cambridge","to S Cambs","to Rest of E","to Rest of UK","to Abroad"),
node_styles = styles)
#Plotting
plot( r, plot_area = 0.9)
And my data is here:
dput(Cambs_flows)
structure(list(N1 = c("Cambridge", "Cambridge", "Cambridge",
"Cambridge", "Cambridge", "S Cambs", "S Cambs", "S Cambs", "S Cambs",
"S Cambs", "Rest of E", "Rest of E", "Rest of E", "Rest of E",
"Rest of E", "Rest of UK", "Rest of UK", "Rest of UK", "Rest of UK",
"Rest of UK", "Abroad", "Abroad", "Abroad", "Abroad", "Abroad"
), N2 = c("to Cambridge", "to S Cambs", "to Rest of E", "to Rest of UK",
"to Abroad", "to Cambridge", "to S Cambs", "to Rest of E", "to Rest of UK",
"to Abroad", "to Cambridge", "to S Cambs", "to Rest of E", "to Rest of UK",
"to Abroad", "to Cambridge", "to S Cambs", "to Rest of E", "to Rest of UK",
"to Abroad", "to Cambridge", "to S Cambs", "to Rest of E", "to Rest of UK",
"to Abroad"), Value = c(0L, 1616L, 2779L, 13500L, 5670L, 2593L,
0L, 2975L, 4742L, 1641L, 2555L, 3433L, 0L, 0L, 0L, 6981L, 3802L,
0L, 0L, 0L, 5670L, 1641L, 0L, 0L, 0L)), class = c("tbl_df", "tbl",
"data.frame"), row.names = c(NA, -25L), .Names = c("N1", "N2",
"Value"), spec = structure(list(cols = structure(list(N1 = structure(list(), class = c("collector_character",
"collector")), N2 = structure(list(), class = c("collector_character",
"collector")), Value = structure(list(), class = c("collector_integer",
"collector"))), .Names = c("N1", "N2", "Value")), default = structure(list(), class = c("collector_guess",
"collector"))), .Names = c("cols", "default"), class = "col_spec"))
The culprit is a line in riverplot::curveseg. We can hack this function to fix it, or there is a very simple workaround that does not require hacking the function. In fact, the simple solution is probably preferable in many cases, but first I explain how to hack the function, so we understand why the workaround also works. Scroll to the end of this answer if you only want the simple solution:
UPDATE: The change suggested below has now been implemented in riverplot version 0.6
To edit the function, you can use
trace(curveseg, edit = TRUE)
Then find the line near the end of the function that reads
polygon(c(xx[i], xx[i + 1], xx[i + 1], xx[i]), c(yy[i],
yy[i + 1], yy[i + 1] + w, yy[i] + w), col = grad[i],
border = grad[i])
We can see here that the package authors chose not to pass the lty parameter to polygon (UPDATE: see this answer for an explanation of why the package author did it this way). Change this line by adding lty = 0 (or, if you prefer, border = NA) and it works as intended for the OP's case. (But note that this may not work well if you wish to render a PDF; see here.)
polygon(c(xx[i], xx[i + 1], xx[i + 1], xx[i]), c(yy[i],
yy[i + 1], yy[i + 1] + w, yy[i] + w), col = grad[i],
border = grad[i], lty=0)
As a side note, this also explains the somewhat odd behaviour reported in the comments that "if you run it twice, the second time the plot looks OK, although export it and the lines come back". When lty is not specified in a call to polygon, the default value it uses is lty = par("lty"). Initially, the default par("lty") is a solid line, but after running the riverplot function once, par("lty") gets set to 0 during a call to riverplot:::draw.nodes, thus suppressing the lines when riverplot is run a second time. But if you then try to export the image, opening a new device resets par("lty") to its default value.
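The device-reset behaviour described above can be checked with base graphics alone. This is a minimal sketch, assuming a null PDF device (pdf(NULL)) is an acceptable stand-in for the RStudio or export device; the variable names are illustrative only.

```r
# Open a throwaway device; pdf(NULL) writes no file.
pdf(NULL)
lty_fresh <- par("lty")     # value a new device starts with
par(lty = 0)                # what riverplot:::draw.nodes effectively leaves behind
lty_changed <- par("lty")   # polygon() without an lty argument now uses this
invisible(dev.off())

# Opening another device (as an export does) starts from the default again.
pdf(NULL)
lty_reset <- par("lty")
invisible(dev.off())

print(c(fresh = lty_fresh, changed = lty_changed, reset = lty_reset))
```

Because the setting belongs to the device, it is lost as soon as the export opens a new one, which is why the lines come back.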
An alternative way to update the function with this edit is to use assignInNamespace to overwrite the package function with your own version. Like this:
curveseg.new = function (x0, x1, y0, y1, width = 1, nsteps = 50, col = "#ffcc0066",
grad = NULL, lty = 1, form = c("sin", "line"))
{
w <- width
if (!is.null(grad)) {
grad <- colorRampPaletteAlpha(grad)(nsteps)
}
else {
grad <- rep(col, nsteps)
}
form <- match.arg(form, c("sin", "line"))
if (form == "sin") {
xx <- seq(-pi/2, pi/2, length.out = nsteps)
yy <- y0 + (y1 - y0) * (sin(xx) + 1)/2
xx <- seq(x0, x1, length.out = nsteps)
}
if (form == "line") {
xx <- seq(x0, x1, length.out = nsteps)
yy <- seq(y0, y1, length.out = nsteps)
}
for (i in 1:(nsteps - 1)) {
polygon(c(xx[i], xx[i + 1], xx[i + 1], xx[i]),
c(yy[i], yy[i + 1], yy[i + 1] + w, yy[i] + w),
col = grad[i], border = grad[i], lty=0)
lines(c(xx[i], xx[i + 1]), c(yy[i], yy[i + 1]), lty = lty)
lines(c(xx[i], xx[i + 1]), c(yy[i] + w, yy[i + 1] + w), lty = lty)
}
}
# pos and envir are left at their defaults; only the namespace needs naming
assignInNamespace("curveseg", curveseg.new, ns = "riverplot")
Now for the simple solution, which does not require changes to the function:
Just add the line par(lty=0) before you plot!
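Since opening a new device resets par("lty"), the par call has to come after the device is opened when exporting. The following is a sketch under assumptions: the toy nodes and edges are hypothetical stand-ins for the question's data, the output goes to a temporary PNG, and the riverplot package is assumed to be installed.

```r
library(riverplot)

# Hypothetical toy data using the same column layout as the question
nodes <- data.frame(ID = c("A", "B", "to A", "to B"),
                    x  = c(1, 1, 2, 2),
                    y  = c(1, 2, 1, 2),
                    stringsAsFactors = FALSE)
edges <- data.frame(N1    = c("A", "A", "B"),
                    N2    = c("to A", "to B", "to A"),
                    Value = c(10, 5, 8),
                    stringsAsFactors = FALSE)
edges$ID <- seq_len(nrow(edges))
r <- makeRiver(nodes, edges)

out <- tempfile(fileext = ".png")
png(out, width = 800, height = 600)  # open the export device first
op <- par(lty = 0)                   # then suppress outlines on that device
plot(r, plot_area = 0.9)
par(op)                              # restore the previous setting
invisible(dev.off())
```

Saving the old value in op and restoring it afterwards keeps the change from leaking into later plots on the same device.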
Here is the author of the package. I am now struggling to find a satisfactory solution to include in the next version of the package.
The problem is with how R renders PDFs as compared to bitmaps. In the original version of the package, I did indeed pass lty=0 to polygon() (you can still see it in the commented source code). However, polygons without borders look good only in PNG graphics. In PDF output, thin white lines appear between the polygons. Take a look:
cc <- "#E41A1C90"
plot.new()
rect(0.2, 0.2, 0.4, 0.4, col=cc, border=NA)
rect(0.4, 0.2, 0.6, 0.4, col=cc, border=NA)
dev.copy2pdf(file="riverplot.pdf")
In X or on PNG, the output is correct. However, if rendered as PDF, you will see a thin white line between the rectangles.
When you render a riverplot graphic as PDF, this looks really bad.
I therefore forced adding borders, but forgot to check transparency. When no transparency is used, this looks OK: the borders overlap with the polygons as well as with each other, but you cannot see it. The PDF is now acceptable. However, it messes up the figure if you have transparency.
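The effect of forced borders on a semi-transparent fill can be reproduced with two pairs of rectangles. This is only an illustrative sketch: the coordinates, the exaggerated lwd and the temporary PNG file are assumptions made for the demonstration, not what the package draws.

```r
cc <- "#E41A1C90"   # semi-transparent red, as in the example above

out <- tempfile(fileext = ".png")
png(out, width = 600, height = 300)
plot.new()

# Left pair: no borders, so the shared edge is painted only by the fills.
rect(0.05, 0.2, 0.25, 0.8, col = cc, border = NA)
rect(0.25, 0.2, 0.45, 0.8, col = cc, border = NA)

# Right pair: borders in the fill colour. Each border is drawn on top of
# its own fill and on top of the neighbour, so the alpha accumulates there.
rect(0.55, 0.2, 0.75, 0.8, col = cc, border = cc, lwd = 4)
rect(0.75, 0.2, 0.95, 0.8, col = cc, border = cc, lwd = 4)

invisible(dev.off())
```

With an opaque colour the two pairs should look the same; with transparency the overlapping borders show up as the darker outlines described in the question.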
EDIT:
I have now uploaded version 0.6 of riverplot to CRAN. Besides some new stuff (you can now add riverplot to any part of an existing drawing), by default it uses lty=0 again. However, there is now an option called "fix.pdf" which you can set to TRUE in order to draw the borders around the segments again.
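A sketch of how the two modes might be used, assuming riverplot 0.6 or later is installed and that fix.pdf is passed as an argument to the plot call (as I understand the package documentation; check ?riverplot for your version). The toy data and temporary file names are hypothetical.

```r
library(riverplot)

# Hypothetical toy data, same column layout as the question
nodes <- data.frame(ID = c("A", "to A", "to B"),
                    x  = c(1, 2, 2),
                    y  = c(1, 1, 2),
                    stringsAsFactors = FALSE)
edges <- data.frame(N1    = c("A", "A"),
                    N2    = c("to A", "to B"),
                    Value = c(10, 5),
                    stringsAsFactors = FALSE)
edges$ID <- seq_len(nrow(edges))
r <- makeRiver(nodes, edges)

# Bitmap output: the borderless default suits transparent colours.
out_png <- tempfile(fileext = ".png")
png(out_png)
plot(r)
invisible(dev.off())

# PDF output: ask for the borders back to hide the thin white seams.
# Best combined with opaque colours, since borders and alpha do not mix well.
out_pdf <- tempfile(fileext = ".pdf")
pdf(out_pdf)
plot(r, fix.pdf = TRUE)
invisible(dev.off())
```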
Bottom line, and solutions for now: