I have the following data frame:
id total_transfered_amount day
1 1000 2
1 2000 3
1 3000 4
1 1000 1
1 10000 4
2 5000 3
2 6000 4
2 40000 2
2 4000 3
2 4000 3
3 1000 1
3 2000 2
3 3000 3
3 30000 3
3 3000 3
Need to filter out rows that fall above 90 percentile in 'total_transfered_amount' column for every id seperately using dplyr package preferabely , for example I need to filter out following rows:
2 40000 2
3 30000 3
Checkt this out. I do not understand why you have first row in your output
dane <- data.frame(id = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3,3 ), total_trans = c(1000, 2000, 3000, 1000, 10000, 5000, 6000,
40000, 4000, 4000, 1000, 2000, 3000, 30000, 3000),
day = c(2, 3,4, 1, 4, 3, 4, 2, 3, 3, 1, 2, 3, 3, 3))
library(dplyr)
dane %>% group_by(id) %>% filter(quantile(total_trans, 0.9)<total_trans)
id total_trans day
(dbl) (dbl) (dbl)
1 1 10000 4
2 2 40000 2
3 3 30000 3
We can use data.table
library(data.table)
setDT(df1)[,.SD[quantile(total_transfered_amount, 0.9) <
total_transfered_amount] , by = id]
# id total_transfered_amount day
#1: 1 10000 4
#2: 2 40000 2
#3: 3 30000 3
Or we can use base R
df1[with(df1, as.logical(ave(total_transfered_amount, id,
FUN=function(x) quantile(x, 0.9) < x))),]
# id total_transfered_amount day
#5 1 10000 4
#8 2 40000 2
#14 3 30000 3
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With