ggplot aes_string doesn't work with spaces

Tags: r, ggplot2

Doesn't work:

mydat <- data.frame(`Col 1`=1:5, `Col 2`=1:5, check.names=F)
xcol <- "Col 1"
ycol <- "Col 2"
ggplot(data=mydat, aes_string(x=xcol, y=ycol)) + geom_point()

Works:

mydat <- data.frame(`A`=1:5, `B`=1:5)
xcol <- "A"
ycol <- "B"
ggplot(data=mydat, aes_string(x=xcol, y=ycol)) + geom_point()

Also works:

mydat <- data.frame(`Col 1`=1:5, `Col 2`=1:5, check.names=F)
ggplot(data=mydat, aes(x=`Col 1`, y=`Col 2`)) + geom_point()

What's the issue?

thc asked Aug 02 '18 16:08


2 Answers

UPDATE: Note that in more recent versions of ggplot2, the use of aes_string is discouraged. Instead, if you need to get a column's values from a string, use the .data pronoun:

ggplot(data=mydat, aes(x=.data[[xcol]], y=.data[[ycol]])) + geom_point()
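If the column names arrive as strings inside a helper function, the same pronoun can be used there. The following is only a sketch: scatter_by_name is a hypothetical name, and it assumes a ggplot2 version recent enough to provide the .data pronoun (3.0.0 or later).

```r
library(ggplot2)

mydat <- data.frame(`Col 1` = 1:5, `Col 2` = 1:5, check.names = FALSE)

# Hypothetical helper: xcol and ycol are character strings naming columns.
scatter_by_name <- function(data, xcol, ycol) {
  ggplot(data, aes(x = .data[[xcol]], y = .data[[ycol]])) +
    geom_point() +
    # Set the axis labels explicitly, since the default label
    # may show the .data expression rather than the column name.
    labs(x = xcol, y = ycol)
}

p <- scatter_by_name(mydat, "Col 1", "Col 2")
```

Because .data[[xcol]] looks the column up by its string name, spaces in the name never go through the parser.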

ORIGINAL ANSWER: Values passed to aes_string are parse()-d. This is because you can pass things like aes_string(x="log(price)"), where you aren't passing a column name but an expression. So it treats your string as an expression, and when it tries to parse "Col 1" it hits the space, which makes the expression invalid. You can "fix" this by wrapping the column names in backticks. For example, this works:

mydat <- data.frame(`Col 1`=1:5, `Col 2`=1:5, check.names=F)
xcol <- "Col 1"
ycol <- "Col 2"
ggplot(data=mydat, aes_string(x=paste0("`", xcol, "`"), y=paste0("`", ycol, "`"))) + geom_point()

We just use paste0() to put backticks around our values. Ordinary quotes (for example from shQuote()) are not enough: a quoted string parses as a character constant rather than a column reference, so the plot would show that constant instead of the data. You could also have embedded the backticks in your strings, like you did in the other example:

mydat <- data.frame(`Col 1`=1:5, `Col 2`=1:5, check.names=F)
xcol <- "`Col 1`"
ycol <- "`Col 2`"
ggplot(data=mydat, aes_string(x=xcol, y=ycol)) + geom_point()
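To see why the space matters, it can help to look at how R parses these strings. The sketch below uses base R's parse() as a stand-in for the parsing step inside aes_string; the variable names are purely illustrative.

```r
# A bare name containing a space is not a valid expression, so parsing fails.
bad <- tryCatch(parse(text = "Col 1"), error = function(e) "parse error")

# Backticks make it a single symbol that refers to the column.
sym <- parse(text = "`Col 1`")[[1]]

# Ordinary quotes give a character constant, not a column reference.
chr <- parse(text = "'Col 1'")[[1]]

# An expression such as log(price) parses to a call.
cl <- parse(text = "log(price)")[[1]]

is.symbol(sym)
is.character(chr)
is.call(cl)
```

Only the backticked form yields a symbol, which is what lets it be looked up as a column in the data.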

But really, the best way to deal with this is to avoid column names that are not valid variable names.

MrFlick answered Sep 19 '22 02:09


Here's a tidyeval approach, which is what the tidyverse development crew is moving towards in place of aes_ or aes_string. Tidyeval is tricky at first, but pretty well documented.

This recipe sheet isn't ggplot-specific, but it's on my bookmarks toolbar because it's pretty handy.

In this case, you want to write a function to handle making your plot. This function takes a data frame and two bare column names as arguments. Inside the function, enquo() captures each column name as a quosure, and !! unquotes it for use in aes().

library(ggplot2)

mydat <- data.frame(`Col 1`=1:5, `Col 2`=1:5, check.names=F)

pts <- function(data, xcol, ycol) {
  x_var <- enquo(xcol)
  y_var <- enquo(ycol)
  ggplot(data, aes(x = !!x_var, y = !!y_var)) +
    geom_point()
}

pts(mydat, `Col 1`, `Col 2`)
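As a possible shorthand, the enquo() plus !! pair can be written with the embrace operator {{ }}, assuming rlang 0.4.0 or later alongside a recent ggplot2. This is a sketch of the same function, and pts2 is a hypothetical name.

```r
library(ggplot2)

mydat <- data.frame(`Col 1` = 1:5, `Col 2` = 1:5, check.names = FALSE)

# Hypothetical variant of pts(): {{ }} captures and unquotes each
# bare column name in one step.
pts2 <- function(data, xcol, ycol) {
  ggplot(data, aes(x = {{ xcol }}, y = {{ ycol }})) +
    geom_point()
}

p2 <- pts2(mydat, `Col 1`, `Col 2`)
```

The call site is unchanged: column names with spaces are still passed bare, wrapped in backticks.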

But also like @MrFlick said, do whatever you can to just use valid column names, because why not?

camille answered Sep 19 '22 02:09