I'm trying to filter rows using the count() helper.
The output I want is all the rows whose StudentID appears exactly 3 times, i.e. where df %>% count(StudentID) gives 3.
For instance, in the df below, it should drop all the rows with StudentID 10016 and 10020, as there are only 2 instances of each and I want 3.
StudentID  StudentGender  Grade  TermName   ScaleName       TestRITScore
100        M              9      Fall 2010  Language Usage  217
100        M              10     2011-2012  Language Usage  220
100        M              9      Fall 2010  Reading         210
10016      M              6      Fall 2010  Language Usage  217
10016      M              6      Fall 2010  Mathematics     210
10020      F              7      Fall 2010  Language Usage  210
10020      F              7      Fall 2010  Mathematics     213
10022      F              8      Fall 2010  Language Usage  232
10022      F              9      2011-2012  Language Usage  240
10022      F              8      Fall 2010  Mathematics     242
If I do:
count(df, StudentID)
then it only gives me a df with 2 columns, but I want to keep all the columns of my df. That's why I think I should use filter.
I don't think count() is what you are looking for. Try n() instead:
df %>%
  group_by(StudentID) %>%
  filter(n() == 3)
# Source: local data frame [6 x 6]
# Groups: StudentID
#
# StudentID StudentGender Grade TermName ScaleName TestRITScore
# 1 100 M 9 Fall 2010 Language Usage 217
# 2 100 M 10 2011-2012 Language Usage 220
# 3 100 M 9 Fall 2010 Reading 210
# 4 10022 F 8 Fall 2010 Language Usage 232
# 5 10022 F 9 2011-2012 Language Usage 240
# 6 10022 F 8 Fall 2010 Mathematics 242
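Since the question started from count(), a variant that stays close to it may also be worth sketching. This is only a sketch, assuming a dplyr version that provides add_count(); the small df built here is a hypothetical stand-in holding just two of the question's columns. add_count() appends the per-StudentID count as a column n while keeping every original column, so a plain filter() can follow it:

```r
library(dplyr)

# Hypothetical stand-in for the question's data (two columns only).
df <- data.frame(
  StudentID    = c(100, 100, 100, 10016, 10016, 10020, 10020, 10022, 10022, 10022),
  TestRITScore = c(217, 220, 210, 217, 210, 210, 213, 232, 240, 242)
)

# add_count() adds a column n with the number of rows per StudentID,
# without collapsing the data the way count() does.
result <- df %>%
  add_count(StudentID) %>%
  filter(n == 3) %>%   # keep only students with exactly 3 rows
  select(-n)           # drop the helper column again

result
```

Unlike the group_by() version, the result here is not grouped, so no ungroup() is needed before further processing.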