Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Count number of rows within each group

People also ask

How do I count the number of rows in a GROUP BY?

To count the number of rows, use the id column which stores unique values (in our example we use COUNT(id) ). Next, use the GROUP BY clause to group records according to columns (the GROUP BY category above). After using GROUP BY to filter records with aggregate functions like COUNT, use the HAVING clause.

How do I count rows in MySQL by group?

In MySQL, the COUNT() function calculates the number of results from a table when executing a SELECT statement. It does not contain NULL values. The function returns a BIGINT value. It can count all the matched rows or only rows that match the specified conditions.

How do I count the number of rows in each column?

If you need a quick way to count rows that contain data, select all the cells in the first column of that data (it may not be column A). Just click the column header. The status bar, in the lower-right corner of your Excel window, will tell you the row count.

How do you use count in GROUP BY clause?

SQL – count() with Group By clause The count() function is an aggregate function use to find the count of the rows that satisfy the fixed conditions. The count() function with the GROUP BY clause is used to count the data which were grouped on a particular attribute of the table.


Current best practice (tidyverse) is:

require(dplyr)
df1 %>% count(Year, Month)

Following @Joshua's suggestion, here's one way you might count the number of observations in your df dataframe where Year = 2007 and Month = Nov (assuming they are columns):

nrow(df[,df$YEAR == 2007 & df$Month == "Nov"])

and with aggregate, following @GregSnow:

aggregate(x ~ Year + Month, data = df, FUN = length)

dplyr package does this with count/tally commands, or the n() function:

First, some data:

df <- data.frame(x = rep(1:6, rep(c(1, 2, 3), 2)), year = 1993:2004, month = c(1, 1:11))

Now the count:

library(dplyr)
count(df, year, month)
#piping
df %>% count(year, month)

We can also use a slightly longer version with piping and the n() function:

df %>% 
  group_by(year, month) %>%
  summarise(number = n())

or the tally function:

df %>% 
  group_by(year, month) %>%
  tally()

An old question without a data.table solution. So here goes...

Using .N

library(data.table)
DT <- data.table(df)
DT[, .N, by = list(year, month)]

The simple option to use with aggregate is the length function which will give you the length of the vector in the subset. Sometimes a little more robust is to use function(x) sum( !is.na(x) ).


An alternative to the aggregate() function in this case would be table() with as.data.frame(), which would also indicate which combinations of Year and Month are associated with zero occurrences

df<-data.frame(x=rep(1:6,rep(c(1,2,3),2)),year=1993:2004,month=c(1,1:11))

myAns<-as.data.frame(table(df[,c("year","month")]))

And without the zero-occurring combinations

myAns[which(myAns$Freq>0),]