
How to filter dataframe with multiple conditions?

I have this dataframe that I'd like to subset (if possible, with dplyr or base R functions):

df <- data.frame(x = c(1,1,1,2,2,2), y = c(30,10,8,10,18,5))

x  y
1 30
1 10
1  8
2 10
2 18
2  5

Treating x as a factor with two levels, how can I subset/filter this dataframe so that I keep only the rows where df$y is greater than 15 when df$x == 1, and greater than 5 when df$x == 2?

This is what I'd like to get:

df2 <- data.frame(x = c(1,2,2), y = c(30,10,18))

x  y
1 30
2 10
2 18

Appreciate any help! Thanks!

asked Oct 20 '22 12:10 by hsl


1 Answer

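If you have several 'x' groups, one option is to use mapply. We split 'y' using 'x' as the grouping variable, create the vector of thresholds to compare against (c(15,5), one value per group, in group order), and use mapply to get the logical index for subsetting 'df'. Note that this relies on the rows of 'df' already being ordered by 'x', because the logical index comes back grouped by 'x'.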

df[unlist(mapply('>', split(df$y, df$x), c(15,5))),]
#  x  y
#1 1 30
#4 2 10
#5 2 18
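
If the rows are not sorted by 'x', a per-row threshold lookup may be easier to reason about. The following is only a sketch of that alternative, not part of the approach above: the named vector `thr` and the variable `result` are illustrative names, and it assumes every value of 'x' has an entry in `thr`.

```r
df <- data.frame(x = c(1,1,1,2,2,2), y = c(30,10,8,10,18,5))

# Hypothetical lookup table: one threshold per level of x, keyed by name
thr <- c("1" = 15, "2" = 5)

# Look up each row's threshold by its x value, then compare row by row;
# unname() drops the lookup names so the index is a plain logical vector
keep <- df$y > unname(thr[as.character(df$x)])
result <- df[keep, ]
result
#   x  y
# 1 1 30
# 4 2 10
# 5 2 18

# The same condition with dplyr, if it is installed
if (requireNamespace("dplyr", quietly = TRUE)) {
  dplyr::filter(df, y > unname(thr[as.character(x)]))
}
```

Because the threshold is matched to each row through its 'x' value, the result does not depend on row order or on the groups having equal sizes.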

answered Oct 22 '22 21:10 by akrun