Count number of rows per group and add result to original data frame

Say I have a data.frame object:

df <- data.frame(name = c('black','black','black','red','red'),
                 type = c('chair','chair','sofa','sofa','plate'),
                 num  = c(4,5,12,4,3))

Now I want to count the number of rows (observations) for each combination of name and type. This can be done like so:

table(df[ , c("name","type")]) 

or possibly also with plyr (though I am not sure how).

However, how do I get the results incorporated into the original data frame, so that the result looks like this?

df
#    name  type num count
# 1 black chair   4     2
# 2 black chair   5     2
# 3 black  sofa  12     1
# 4   red  sofa   4     1
# 5   red plate   3     1

where count now stores the results from the aggregation.

A solution with plyr could be interesting to learn as well, though I would like to see how this is done with base R.

Uri Laserson asked Sep 16 '11 21:09



1 Answer

Using data.table:

library(data.table)
dt = as.data.table(df)

# or coerce to data.table by reference:
# setDT(df)

dt[ , count := .N, by = .(name, type)]

For an alternative that works with data.table versions before 1.8.2, see the edit history.


Using dplyr:

library(dplyr)
df %>%
  group_by(name, type) %>%
  mutate(count = n())

Or simply:

add_count(df, name, type) 

Using plyr:

# grouping columns are passed as strings, because .() is only visible once plyr is attached
plyr::ddply(df, c("name", "type"), transform, count = length(num))
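Using base R:

The question also asks how this is done in base R, which the approaches above do not cover. One possible sketch, offered as an assumption rather than as part of the original answer, uses ave(): it applies a function within each group and returns a vector the same length as its input, so the result can be assigned straight back as a new column. The choice of num as the vector passed to ave() is arbitrary; only its length per group matters.

```r
# Same example data as in the question
df <- data.frame(name = c('black','black','black','red','red'),
                 type = c('chair','chair','sofa','sofa','plate'),
                 num  = c(4,5,12,4,3))

# For every row, compute the size of its (name, type) group.
# ave() keeps the original row order and length.
df$count <- ave(df$num, df$name, df$type, FUN = length)

df$count
# count is 2, 2, 1, 1, 1
```

Because ave() preserves row order, no merge or re-sorting step is needed, unlike an approach that first builds a table of counts and then joins it back.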
Ramnath answered Oct 03 '22 01:10