
Count number of rows by group using dplyr

Tags: r, count, dplyr, plyr

I am using the mtcars dataset. I want to find the number of records for each combination of values, similar to a count(*) with a group by clause in SQL. ddply() from plyr works for me:

library(plyr)
ddply(mtcars, .(cyl, gear), nrow)

has output

  cyl gear V1
1   4    3  1
2   4    4  8
3   4    5  2
4   6    3  2
5   6    4  4
6   6    5  1
7   8    3 12
8   8    5  2

Using this code

library(dplyr)
g <- group_by(mtcars, cyl, gear)
summarise(g, length(gear))

has output

  length(cyl)
1          32

I found various functions to pass in to summarise() but none seem to work for me. One function I found is sum(G), which returned

Error in eval(expr, envir, enclos) : object 'G' not found 

Tried using n(), which returned

Error in n() : This function should not be called directly 

What am I doing wrong? How can I get group_by() / summarise() to work for me?

asked Mar 31 '14 at 17:03 by charmee




2 Answers

There's a special function n() in dplyr to count rows (potentially within groups):

library(dplyr)
mtcars %>%
  group_by(cyl, gear) %>%
  summarise(n = n())
#Source: local data frame [8 x 3]
#Groups: cyl [?]
#
#    cyl  gear     n
#  (dbl) (dbl) (int)
#1     4     3     1
#2     4     4     8
#3     4     5     2
#4     6     3     2
#5     6     4     4
#6     6     5     1
#7     8     3    12
#8     8     5     2

dplyr also offers a handy count() function, which does the same with less typing:

count(mtcars, cyl, gear)          # or mtcars %>% count(cyl, gear)
#Source: local data frame [8 x 3]
#Groups: cyl [?]
#
#    cyl  gear     n
#  (dbl) (dbl) (int)
#1     4     3     1
#2     4     4     8
#3     4     5     2
#4     6     3     2
#5     6     4     4
#6     6     5     1
#7     8     3    12
#8     8     5     2
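To check that the two forms agree, they can be compared directly. This is a minimal sketch, assuming a reasonably recent dplyr is installed; the names by_summarise and by_count are only illustrative.

```r
library(dplyr)

# Count rows per (cyl, gear) combination with group_by() + summarise().
by_summarise <- mtcars %>%
  group_by(cyl, gear) %>%
  summarise(n = n()) %>%
  ungroup()

# The same counts with the count() shortcut.
by_count <- mtcars %>%
  count(cyl, gear) %>%
  ungroup()

# Compare the two results as plain data frames.
print(all.equal(as.data.frame(by_summarise), as.data.frame(by_count)))

# Cross-check against base R: the group counts add up to the row count.
print(sum(by_count$n) == nrow(mtcars))
```

Depending on the dplyr version, summarise() may print a message about the grouping of its result. The ungroup() calls drop any remaining grouping so that only the values are compared.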
answered Sep 24 '22 at 21:09 by talat


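Another approach is to call the dplyr functions with an explicit namespace, using the double colon. This helps when plyr is loaded after dplyr: plyr's summarise() then masks dplyr's version and ignores the grouping, which is the likely cause of the single row containing 32 in the question.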

mtcars %>%
  dplyr::group_by(cyl, gear) %>%
  dplyr::summarise(length(gear))
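The same idea also works with n(), which the question could not get to run. This is a sketch assuming a reasonably recent dplyr is installed; it does not attach any package, so the result should not depend on whether plyr is loaded. The names grouped and counts are illustrative.

```r
# Fully qualified calls resolve to dplyr's functions even if plyr was
# attached later and masks summarise() on the search path.
grouped <- dplyr::group_by(mtcars, cyl, gear)

# dplyr::n() counts the rows in each group; name the result column n.
counts <- dplyr::summarise(grouped, n = dplyr::n())

print(counts)
```

Intermediate variables are used instead of %>% so that the snippet runs without library(dplyr).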
answered Sep 26 '22 at 21:09 by user3026255