
How to count the number of unique values by group? [duplicate]

Tags:

r

ID = c('A', 'A', 'A', 'B', 'B', 'B')
color = c('white', 'green', 'orange', 'white', 'green', 'green')
d = data.frame(ID, color)

My desired result is

unique_colors = c(3, 3, 3, 2, 2, 2)
d = data.frame(ID, color, unique_colors)

or, more clearly, in a new data frame c:

ID = c('A', 'B')
unique_colors = c(3, 2)
c = data.frame(ID, unique_colors)

I've tried different combinations of aggregate and ave, as well as by and with, and I suppose the answer is some combination of those functions.

The solution would include:

length(unique(d$color)) 

to calculate the number of unique elements.
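For reference, here is a minimal base R sketch of how that expression could be applied per group, using tapply for the summary and ave for the per-row column. The helper name n_unique and the object names per_group and counts are made up for illustration; they are not part of the question or of any package.

```r
# Sample data from the question
ID <- c('A', 'A', 'A', 'B', 'B', 'B')
color <- c('white', 'green', 'orange', 'white', 'green', 'green')
d <- data.frame(ID, color)

# Hypothetical helper: number of distinct values in a vector
n_unique <- function(x) length(unique(x))

# One count per group, returned as a named array (names are the IDs)
per_group <- tapply(d$color, d$ID, n_unique)

# The same counts as a small summary data frame
counts <- data.frame(ID = names(per_group),
                     unique_colors = as.vector(per_group))

# Per-row column: ave() returns a vector of the same type as its first
# argument, so the colors are converted to integer codes first to get
# numeric counts back
d$unique_colors <- ave(as.integer(factor(d$color)), d$ID, FUN = n_unique)
```

The conversion to integer codes is a workaround chosen for this sketch: calling ave directly on a character or factor column would coerce the counts to that type.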

asked Jan 27 '15 15:01 by rmuc8



1 Answer

I think you've got it all wrong here. There is no need for either plyr or <- when using data.table, because setDT() and := modify the object by reference.

Recent versions of data.table, v >= 1.9.6, have a new function uniqueN() just for that.

library(data.table) ## >= v1.9.6
setDT(d)[, .(count = uniqueN(color)), by = ID]
#    ID count
# 1:  A     3
# 2:  B     2

If you want to create a new column with the counts, use the := operator

setDT(d)[, count := uniqueN(color), by = ID] 

Or with dplyr use the n_distinct function

library(dplyr)
d %>%
  group_by(ID) %>%
  summarise(count = n_distinct(color))
# Source: local data table [2 x 2]
#
#   ID count
# 1  A     3
# 2  B     2

Or (if you want a new column) use mutate instead of summarise

d %>%
  group_by(ID) %>%
  mutate(count = n_distinct(color))
answered Oct 05 '22 18:10 by David Arenburg