I'm trying to filter out whole rows in R, but only if the frequencies for a particular set don't add up to more than 5.
The data I have looks a bit like this. It's a dataframe that I'm currently calling "Words":
HEADWORD VARIANT FREQUENCY
SWORD sword 2
SWORD swerd 1
SWORD sworde 1
KNIGHT knight 6
KNIGHT kniht 2
KNIGHT knyt 1
I only want rows for which the frequencies within a particular headword add up to more than 5. So here, I want to keep all the instances of KNIGHT but I want to get rid of all the SWORD rows entirely.
I tried to do this with dplyr, but with no success. This is the code I tried:
Words1 %>% group_by(HW) %>% filter(Fr > 5)
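We need to take the sum of 'FREQUENCY' within each 'HEADWORD' group and check whether that sum is greater than 5 inside filter(), rather than testing each row's own frequency. After grouping by 'HEADWORD', sum(FREQUENCY) is computed per group, so the condition keeps or drops every row of a group together: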
library(dplyr)

Words1 %>%
  group_by(HEADWORD) %>%
  filter(sum(FREQUENCY) > 5)
# HEADWORD VARIANT FREQUENCY
# <chr> <chr> <int>
#1 KNIGHT knight 6
#2 KNIGHT kniht 2
#3 KNIGHT knyt 1
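To make the above easy to try, here is a minimal reproducible sketch. The data frame is rebuilt by hand from the table in the question, so the column types (character and integer) are an assumption, and the names with_totals, kept and TOTAL are only illustrative. The mutate() step is not needed for the answer; it is there only to show the per-group total that filter() ends up comparing against 5.

```r
library(dplyr)

# Sample data rebuilt from the question (column types are assumed)
Words1 <- data.frame(
  HEADWORD  = c("SWORD", "SWORD", "SWORD", "KNIGHT", "KNIGHT", "KNIGHT"),
  VARIANT   = c("sword", "swerd", "sworde", "knight", "kniht", "knyt"),
  FREQUENCY = c(2L, 1L, 1L, 6L, 2L, 1L),
  stringsAsFactors = FALSE
)

# Illustrative only: attach each group's total to its rows
with_totals <- Words1 %>%
  group_by(HEADWORD) %>%
  mutate(TOTAL = sum(FREQUENCY)) %>%
  ungroup()

# The actual filter: keep groups whose total frequency exceeds 5
kept <- Words1 %>%
  group_by(HEADWORD) %>%
  filter(sum(FREQUENCY) > 5) %>%
  ungroup()

print(with_totals)
print(kept)
```

With this data the SWORD rows total 4 and the KNIGHT rows total 9, so only the three KNIGHT rows remain in kept. The trailing ungroup() is optional and just returns an ungrouped result for later steps.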
You can use the base R ave function:
df[ave(df$FREQUENCY, df$HEADWORD, FUN = sum) > 5, ]
# HEADWORD VARIANT FREQUENCY
#4 KNIGHT knight 6
#5 KNIGHT kniht 2
#6 KNIGHT knyt 1
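As a sketch of why this works: ave() returns a vector the same length as its input, with each element replaced by the result of FUN over that element's group, so it can be used directly as a row index. The data frame below is rebuilt by hand from the question (types assumed), and group_totals and kept are just illustrative names.

```r
# Sample data rebuilt from the question (column types are assumed)
df <- data.frame(
  HEADWORD  = c("SWORD", "SWORD", "SWORD", "KNIGHT", "KNIGHT", "KNIGHT"),
  VARIANT   = c("sword", "swerd", "sworde", "knight", "kniht", "knyt"),
  FREQUENCY = c(2L, 1L, 1L, 6L, 2L, 1L),
  stringsAsFactors = FALSE
)

# One total per row, repeated within each HEADWORD group
group_totals <- ave(df$FREQUENCY, df$HEADWORD, FUN = sum)

# Logical row index: TRUE where the row's group total exceeds 5
kept <- df[group_totals > 5, ]

print(group_totals)
print(kept)
```

Because the subsetting keeps the original row names, the KNIGHT rows show up as 4, 5 and 6 rather than being renumbered from 1.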