count number of rows in a data frame in R based on group [duplicate]

People also ask

How do I count the number of rows in each group in R?

The count() method can be applied to the input dataframe containing one or more columns and returns a frequency count corresponding to each of the groups.

How do I count the number of data in a group in R?

count() lets you quickly count the unique values of one or more variables: df %>% count(a, b) is roughly equivalent to df %>% group_by(a, b) %>% summarise(n = n()) .

How do I count the number of rows in R?

To get the number of rows in an R data frame, call nrow() and pass the data frame as its argument. nrow() is part of base R.

How do I count occurrences in R Dataframe?

To count occurrences across two columns, pass both columns to the counting function (for example table()); it returns the frequency of each combination of values from the two columns. The result can be used for further processing, and it widens the range of comparisons you can make.
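As a rough illustration of the two ideas above (total rows with nrow(), and counts per combination of two columns), here is a minimal base R sketch. The data frame df and its columns a and b are made up for the example and do not come from the question.

```r
# Hypothetical data: two categorical columns (not from the original question)
df <- data.frame(
  a = c("x", "x", "y", "y", "y"),
  b = c("p", "q", "p", "p", "q"),
  stringsAsFactors = FALSE
)

# Total number of rows in the data frame
n_rows <- nrow(df)

# Frequency of every (a, b) combination as a contingency table
pair_tab <- table(a = df$a, b = df$b)

# The same counts in long form, one row per combination
pair_counts <- as.data.frame(pair_tab, responseName = "n")

print(n_rows)
print(pair_tab)
print(pair_counts)
```

The long form is usually easier to join or filter; the contingency table is easier to read at a glance.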


The count() function in plyr does what you want:

library(plyr)

# The column is called MONTH.YEAR once the data is read into R
count(mydf, "MONTH.YEAR")

Here's an example that shows how table(.) (or, more closely matching your desired output, data.frame(table(.))) does what it sounds like you are asking for.

Note also how to share reproducible sample data in a way that others can copy and paste into their session.

Here's the (reproducible) sample data:

mydf <- structure(list(ID = c(110L, 111L, 121L, 131L, 141L), 
                       MONTH.YEAR = c("JAN. 2012", "JAN. 2012", 
                                      "FEB. 2012", "FEB. 2012", 
                                      "MAR. 2012"), 
                       VALUE = c(1000L, 2000L, 3000L, 4000L, 5000L)), 
                  .Names = c("ID", "MONTH.YEAR", "VALUE"), 
                  class = "data.frame", row.names = c(NA, -5L))

mydf
#    ID MONTH.YEAR VALUE
# 1 110  JAN. 2012  1000
# 2 111  JAN. 2012  2000
# 3 121  FEB. 2012  3000
# 4 131  FEB. 2012  4000
# 5 141  MAR. 2012  5000

Here's the calculation of the number of rows per group, in two output display formats:

table(mydf$MONTH.YEAR)
# 
# FEB. 2012 JAN. 2012 MAR. 2012 
#         2         2         1

data.frame(table(mydf$MONTH.YEAR))
#        Var1 Freq
# 1 FEB. 2012    2
# 2 JAN. 2012    2
# 3 MAR. 2012    1
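If the default Var1/Freq column names or the sorted group order are not what you want, something along these lines may help. It is only a sketch on the same sample data; the names counts, ordered_groups and counts_in_order are my own.

```r
# Same sample data as above, rebuilt so the snippet runs on its own
mydf <- data.frame(
  ID = c(110L, 111L, 121L, 131L, 141L),
  MONTH.YEAR = c("JAN. 2012", "JAN. 2012", "FEB. 2012", "FEB. 2012", "MAR. 2012"),
  VALUE = c(1000L, 2000L, 3000L, 4000L, 5000L),
  stringsAsFactors = FALSE
)

# Naming the argument to table() names the dimension, so the group column
# comes out as MONTH.YEAR instead of Var1; responseName renames Freq
counts <- as.data.frame(
  table(MONTH.YEAR = mydf$MONTH.YEAR),
  responseName = "count",
  stringsAsFactors = FALSE
)

# table() orders groups by sorted value; a factor with explicit levels
# keeps the order in which the groups first appear in the data
ordered_groups <- factor(mydf$MONTH.YEAR, levels = unique(mydf$MONTH.YEAR))
counts_in_order <- as.data.frame(
  table(MONTH.YEAR = ordered_groups),
  responseName = "count"
)

print(counts)
print(counts_in_order)
```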

Using the example data set that Ananda dummied up, here's an example using aggregate(), which ships with base R (in the stats package). aggregate() just needs something to count as a function of the different values of MONTH.YEAR. In this case, I used VALUE as the thing to count:

aggregate(cbind(count = VALUE) ~ MONTH.YEAR, 
          data = mydf, 
          FUN = function(x){NROW(x)})

which gives you:

  MONTH.YEAR count
1  FEB. 2012     2
2  JAN. 2012     2
3  MAR. 2012     1
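One thing to watch with the formula interface: by default it drops rows where the counted column is NA before FUN is applied, so the count can come out smaller than the number of rows in the group. The sketch below uses a made-up missing VALUE (mydf_na is hypothetical, not the asker's data) to show the difference, and one way around it using the non-formula interface.

```r
# Sample data with one VALUE set to NA (hypothetical, for illustration only)
mydf_na <- data.frame(
  ID = c(110L, 111L, 121L, 131L, 141L),
  MONTH.YEAR = c("JAN. 2012", "JAN. 2012", "FEB. 2012", "FEB. 2012", "MAR. 2012"),
  VALUE = c(1000L, NA, 3000L, 4000L, 5000L),
  stringsAsFactors = FALSE
)

# Formula interface: the row with the NA is removed by the default
# na.action (na.omit) before counting, so JAN. 2012 is counted once
by_formula <- aggregate(cbind(count = VALUE) ~ MONTH.YEAR,
                        data = mydf_na,
                        FUN = NROW)

# Non-formula interface: count a column with no missing values (ID here),
# grouped explicitly, so every row of each group is counted
by_groups <- aggregate(x = list(count = mydf_na$ID),
                       by = list(MONTH.YEAR = mydf_na$MONTH.YEAR),
                       FUN = length)

print(by_formula)
print(by_groups)
```

table(mydf_na$MONTH.YEAR) would also count every row, since it never looks at VALUE.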

Try using the count function in dplyr:

library(dplyr)
dat1_frame %>% 
    count(MONTH.YEAR)

I am not sure how you got MONTH-YEAR as a variable name. My R version does not allow for such a variable name, so I replaced it with MONTH.YEAR.
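For what it's worth, a name like MONTH-YEAR can exist in a data frame if name checking is switched off, and it then has to be wrapped in backticks. A small base R sketch with made-up data (dat and dat_default are hypothetical names):

```r
# check.names = FALSE keeps the non-syntactic column name exactly as given
dat <- data.frame(
  `MONTH-YEAR` = c("JAN. 2012", "JAN. 2012", "FEB. 2012"),
  check.names = FALSE
)
print(names(dat))          # the name still contains the hyphen

# Backticks are needed to refer to such a column
tab <- table(dat$`MONTH-YEAR`)
print(tab)

# With the default check.names = TRUE, the hyphen is replaced by a dot
dat_default <- data.frame(`MONTH-YEAR` = c("JAN. 2012", "FEB. 2012"))
print(names(dat_default))
```

Without the backticks, MONTH-YEAR is parsed as MONTH minus YEAR, which is why sticking with MONTH.YEAR is less trouble.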

As a side note, the mistake in your code was that dat1_frame %.% group_by(MONTH-YEAR) has no summarise step: group_by() on its own only records the grouping and returns all the original rows unchanged. (%.% is dplyr's older chaining operator; current versions use %>%.) So you want to use:

dat1_frame %>%
    group_by(MONTH.YEAR) %>%
    summarise(count=n())
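To try this without the original dat1_frame, here is a self-contained sketch on the sample data from the other answer. It assumes dplyr is installed; the object names counts_short, counts_long and grouped_only are mine.

```r
library(dplyr)

# Sample data from the other answer, rebuilt so the snippet runs on its own
mydf <- data.frame(
  ID = c(110L, 111L, 121L, 131L, 141L),
  MONTH.YEAR = c("JAN. 2012", "JAN. 2012", "FEB. 2012", "FEB. 2012", "MAR. 2012"),
  VALUE = c(1000L, 2000L, 3000L, 4000L, 5000L),
  stringsAsFactors = FALSE
)

# Shorthand: count() groups and tallies in one step
counts_short <- mydf %>%
  count(MONTH.YEAR, name = "count")

# Explicit version: group, then summarise with n()
counts_long <- mydf %>%
  group_by(MONTH.YEAR) %>%
  summarise(count = n())

# group_by() alone only records the grouping; all five rows are still there
grouped_only <- mydf %>%
  group_by(MONTH.YEAR)

print(counts_short)
print(counts_long)
print(nrow(grouped_only))
```

Both forms should give the same counts; count() is just less typing.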