 

Using dplyr, how to pipe or chain to plot()?

Tags: plot, r, dplyr, piping

I am new to the dplyr package and am trying to use it for my visualization assignment. I can pipe my data to ggplot(), but not to plot(). I came across this post, but the answers, including the one in the comments, didn't work for me.

Code 1:

emission <- mynei %>%
    select(Emissions, year) %>%
    group_by(year) %>%
    summarise (total=sum(Emissions))

emission %>%
    plot(year, total,.)

I get the following error:

Error in plot(year, total, emission) : object 'year' not found

Code 2:

mynei %>%
    select(Emissions, year) %>%
    group_by(year) %>%
    summarise (total=sum(Emissions))%>%
    plot(year, total, .)

This didn't work either and returned the same error.

Interestingly, the solution from the post I mentioned works for the dataset used there but not for my own data. However, I am able to create the plot using emission$year and emission$total.

Am I missing anything?

sadiqsaleem asked Nov 14 '14 23:11


2 Answers

plot.default doesn't take a data argument, so your best bet is to pipe to with:

mynei %>%
    select(Emissions, year) %>%
    group_by(year) %>%
    summarise (total=sum(Emissions))%>%
    with(plot(year, total))
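
For anyone who wants to reproduce this, here is a minimal sketch. The `mynei` data frame below is a made-up stand-in (the question doesn't show the real data), so the values are purely illustrative. It first captures the error from the question's call, then applies the `with` fix.

```r
library(dplyr)

# Hypothetical stand-in for the asker's data; the values are invented
mynei <- data.frame(
  year      = rep(c(1999, 2002, 2005, 2008), each = 2),
  Emissions = c(10, 20, 15, 5, 8, 12, 4, 6)
)

emission <- mynei %>%
  select(Emissions, year) %>%
  group_by(year) %>%
  summarise(total = sum(Emissions))

# plot() looks up `year` in the calling environment, not inside the
# data frame, so the call from the question signals an error
err <- tryCatch(
  emission %>% plot(year, total, .),
  error = function(e) conditionMessage(e)
)
print(err)

# with() evaluates plot(year, total) using the data frame's columns
emission %>% with(plot(year, total))
```

Passing the data frame as a third argument doesn't help because `plot` has to evaluate `year` before it ever looks at that argument.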

In case anyone missed @aosmith's comment on the question, plot.formula does have a data argument, but of course the formula is the first argument so we need to use the . to put the data in the right place. So another option is

... %>%
  plot(total ~ year, data = .)

Of course, ggplot takes data as the first argument, so to use ggplot do:

... %>%
  ggplot(aes(x = year, y = total)) + geom_point()

lattice::xyplot is like plot.formula: there is a data argument, but it's not first, so:

... %>% 
  xyplot(total ~ year, data = .)

Just look at the documentation and make sure you use a . if data isn't the first argument. If there's no data argument at all, using with is a good work-around.
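
A rough sketch of that rule of thumb, using a small invented summary table in place of the piped result. The `lm` call is my own addition, included only to show that the same `.` placement works for other formula-first functions.

```r
library(dplyr)

# Hypothetical summary table standing in for the piped result
emission <- data.frame(
  year  = c(1999, 2002, 2005, 2008),
  total = c(30, 20, 20, 10)
)

# Data is not the first argument: place it explicitly with `.`
emission %>% plot(total ~ year, data = .)

# Same pattern for lm(), whose first argument is also a formula
fit <- emission %>% lm(total ~ year, data = .)
coef(fit)

# No data argument at all (plot.default): fall back to with()
emission %>% with(plot(year, total))
```

When `.` appears as a direct argument, the pipe does not also insert the data as the first argument, which is what makes `data = .` work.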

Gregor Thomas answered Oct 31 '22 12:10


As an alternative, you can use the %$% operator from magrittr to be able to access the columns of a dataframe directly. For example:

iris %$%
  plot(Sepal.Length~Sepal.Width)

This is often useful when you need to feed the result of a dplyr chain to a base R function (such as table, lm, plot, etc.). It can also be used to extract a column from a dataframe as a vector, e.g.:

iris %>% filter(Species=='virginica') %$% Sepal.Length

This is the same as:

iris %>% filter(Species=='virginica') %>% pull(Sepal.Length)
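
If you want to check the equivalence yourself, a quick sketch (the variable names are just for illustration; note that `%$%` needs magrittr attached, since dplyr only re-exports `%>%`):

```r
library(dplyr)
library(magrittr)  # provides %$%; dplyr alone only re-exports %>%

via_expose <- iris %>% filter(Species == "virginica") %$% Sepal.Length
via_pull   <- iris %>% filter(Species == "virginica") %>% pull(Sepal.Length)

# Both are plain numeric vectors holding the same values
identical(via_expose, via_pull)
```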

Vlad C. answered Oct 31 '22 14:10