To count the number of rows, use the id column which stores unique values (in our example we use COUNT(id) ). Next, use the GROUP BY clause to group records according to columns (the GROUP BY category above). After using GROUP BY to filter records with aggregate functions like COUNT, use the HAVING clause.
In MySQL, the COUNT() function calculates the number of results from a table when executing a SELECT statement. It does not contain NULL values. The function returns a BIGINT value. It can count all the matched rows or only rows that match the specified conditions.
If you need a quick way to count rows that contain data, select all the cells in the first column of that data (it may not be column A). Just click the column header. The status bar, in the lower-right corner of your Excel window, will tell you the row count.
SQL – count() with Group By clause The count() function is an aggregate function use to find the count of the rows that satisfy the fixed conditions. The count() function with the GROUP BY clause is used to count the data which were grouped on a particular attribute of the table.
Current best practice (tidyverse) is:
require(dplyr)
df1 %>% count(Year, Month)
Following @Joshua's suggestion, here's one way you might count the number of observations in your df
dataframe where Year
= 2007 and Month
= Nov (assuming they are columns):
nrow(df[,df$YEAR == 2007 & df$Month == "Nov"])
and with aggregate
, following @GregSnow:
aggregate(x ~ Year + Month, data = df, FUN = length)
dplyr
package does this with count
/tally
commands, or the n()
function:
First, some data:
df <- data.frame(x = rep(1:6, rep(c(1, 2, 3), 2)), year = 1993:2004, month = c(1, 1:11))
Now the count:
library(dplyr)
count(df, year, month)
#piping
df %>% count(year, month)
We can also use a slightly longer version with piping and the n()
function:
df %>%
group_by(year, month) %>%
summarise(number = n())
or the tally
function:
df %>%
group_by(year, month) %>%
tally()
An old question without a data.table
solution. So here goes...
Using .N
library(data.table)
DT <- data.table(df)
DT[, .N, by = list(year, month)]
The simple option to use with aggregate
is the length
function which will give you the length of the vector in the subset. Sometimes a little more robust is to use function(x) sum( !is.na(x) )
.
An alternative to the aggregate()
function in this case would be table()
with as.data.frame()
, which would also indicate which combinations of Year and Month are associated with zero occurrences
df<-data.frame(x=rep(1:6,rep(c(1,2,3),2)),year=1993:2004,month=c(1,1:11))
myAns<-as.data.frame(table(df[,c("year","month")]))
And without the zero-occurring combinations
myAns[which(myAns$Freq>0),]
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With