
The right way to plot multiple y values as separate lines with ggplot2

Tags: r, ggplot2

I often have a data frame with a single x variable, one or more facet variables, and several other variables. Sometimes I would like to plot several of those as y variables at once, each as a separate line, but I only ever want a subset of them. I've tried using melt to get "variable" as a column and using that, which works if I want every single column that was in the original dataset. Usually I don't.

Right now I do this in what feels like a really roundabout way. Suppose that, with mtcars, I want to plot disp, hp, and wt against mpg:

ggplot(mtcars, aes(x=mpg)) + 
  geom_line(aes(y=disp, color="disp")) + 
  geom_line(aes(y=hp, color="hp")) + 
  geom_line(aes(y=wt, color="wt"))

This feels really redundant. If I melt mtcars first, every variable gets melted, and I wind up plotting variables that I don't want.

Does anyone have a good way of doing this?

Chris Neff asked Sep 27 '11 13:09


2 Answers

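ggplot2 works best with data frames in long format, so melt it, listing only the columns you want in measure.vars:

library(reshape2)
# id.vars is the x variable; measure.vars limits the melt to the wanted y variables
mtcars.long <- melt(mtcars, id.vars = "mpg", measure.vars = c("disp", "hp", "wt"))
ggplot(mtcars.long, aes(mpg, value, colour = variable)) + geom_line()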

There are many other options for doing this transformation. You can see the R-FAQ on converting data from wide to long for an overview.
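As one such option, here is a minimal base R sketch using stack(), which needs no reshaping package. The names vars, stacked, and mtcars_long are only illustrative, and it assumes ggplot2 is installed for the plot:

```r
library(ggplot2)

# y variables to plot (illustrative name, not from the answer above)
vars <- c("disp", "hp", "wt")

# stack() turns the selected columns into two columns: `values` and `ind`
stacked <- stack(mtcars[vars])

# Repeat mpg once per stacked column so the rows line up
mtcars_long <- data.frame(
  mpg      = rep(mtcars$mpg, times = length(vars)),
  variable = stacked$ind,
  value    = stacked$values
)

ggplot(mtcars_long, aes(mpg, value, colour = variable)) + geom_line()
```

Only the columns listed in vars end up in the long data frame, so nothing unwanted gets plotted.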

kohske answered Sep 22 '22 23:09


Since reshape2 is retired (it only gets the changes needed to keep it on CRAN), I updated @kohske's answer to use pivot_longer from tidyr, which is part of the tidyverse.

Pivoting is explained in the tidyr pivoting vignette. The first argument to pivot_longer is the data to reshape. The second argument describes which columns need to be reshaped (prefix a column with - to exclude it). names_to gives the name of the variable that will be created from the data stored in the column names. Finally, values_to gives the name of the variable that will be created from the data stored in the cell values, here value. The vignette also has more complex examples, such as column names that contain numbers, e.g. wk1, wk2, etc.

# new suggestion
library(tidyverse)

# I subset to just the variables wanted so e.g. gear and carb are not included
mtcars.long <- mtcars %>%
  select("mpg", "disp", "hp", "wt") %>%
  pivot_longer(-mpg, names_to = "variable", values_to = "value")

head(mtcars.long)
# # A tibble: 6 x 3
#     mpg variable  value
#   <dbl> <chr>     <dbl>
# 1    21 disp     160
# 2    21 hp       110
# 3    21 wt         2.62
# 4    21 disp     160
# 5    21 hp       110
# 6    21 wt         2.88


ggplot(mtcars.long, aes(mpg, value, colour = variable)) + geom_line()

Chart is:

[image: mtcarstestchart]
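The question also mentions facet variables. A sketch along the same lines, assuming cyl as the facet variable (my choice for illustration, as is the name mtcars.facet), could keep that column out of the pivot so it stays available for faceting:

```r
library(tidyverse)

# Keep cyl as a facet variable; pivot only the three y columns
mtcars.facet <- mtcars %>%
  select(mpg, cyl, disp, hp, wt) %>%
  pivot_longer(c(disp, hp, wt), names_to = "variable", values_to = "value")

# One panel per cyl value, one coloured line per y variable
ggplot(mtcars.facet, aes(mpg, value, colour = variable)) +
  geom_line() +
  facet_wrap(~ cyl)
```

Naming the columns to pivot directly, instead of using -mpg, leaves every other selected column as an identifier column.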

micstr answered Sep 21 '22 23:09