I have a problem to plot a subset of a data frame with ggplot2. My df is like:
df = data.frame(ID = c('P1', 'P1', 'P2', 'P2', 'P3', 'P3'),
                Value1 = c(100, 120, 300, 400, 130, 140),
                Value2 = c(12, 13, 11, 16, 15, 12))
How can I now plot Value1 vs Value2 only for IDs 'P1' and 'P3'?
For example I tried:
ggplot(subset(df,ID=="P1 & P3") +
  geom_line(aes(Value1, Value2, group=ID, colour=ID)))
but I always receive an error.
Method 1: Using subset() function Here, we use subset() function for plotting only subset of DataFrame inside ggplot() function inplace of data DataFrame. All other things are same. Parameters: It takes data object to be subsetted as it's first parameter.
ggplot2 allows you to do data manipulation, such as filtering or slicing, within the data argument.
Here 2 options for subsetting:
Using subset from base R:
library(ggplot2)
ggplot(subset(dat,ID %in% c("P1" , "P3"))) + 
         geom_line(aes(Value1, Value2, group=ID, colour=ID))
Using subset the argument of geom_line(Note I am using plyr package to use the special . function).
library(plyr)
ggplot(data=dat)+ 
  geom_line(aes(Value1, Value2, group=ID, colour=ID),
                ,subset = .(ID %in% c("P1" , "P3")))
You can also use the complementary subsetting:
subset(dat,ID != "P2")
                        There's another solution that I find useful, especially when I want to plot multiple subsets of the same object:
myplot<-ggplot(df)+geom_line(aes(Value1, Value2, group=ID, colour=ID))
myplot %+% subset(df, ID %in% c("P1","P3"))
myplot %+% subset(df, ID %in% c("P2"))
                        With option 2 in @agstudy's answer now deprecated, defining data with a function can be handy.
library(plyr)
ggplot(data=dat) + 
  geom_line(aes(Value1, Value2, group=ID, colour=ID),
            data=function(x){x$ID %in% c("P1", "P3"))
This approach comes in handy if you wish to reuse a dataset in the same plot, e.g. you don't want to specify a new column in the data.frame, or you want to explicitly plot one dataset in a layer above the other.:
library(plyr)
ggplot(data=dat, aes(Value1, Value2, group=ID, colour=ID)) + 
  geom_line(data=function(x){x[!x$ID %in% c("P1", "P3"), ]}, alpha=0.5) +
  geom_line(data=function(x){x[x$ID %in% c("P1", "P3"), ]})
                        If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With