
Counting unique factors in R

Tags: r, unique, r-factor

I would like to know the number of unique dams which gave birth on each of the birth dates recorded. My data frame is similar to this one:

dam <- c("2A11","2A11","2A12","2A12","2A12","4D23","4D23","1X23")
bdate <- c("2009-10-01","2009-10-01","2009-10-01","2009-10-01",
           "2009-10-01","2009-10-03","2009-10-03","2009-10-03")
mydf <- data.frame(dam,bdate)
mydf
#    dam      bdate
# 1 2A11 2009-10-01
# 2 2A11 2009-10-01
# 3 2A12 2009-10-01
# 4 2A12 2009-10-01
# 5 2A12 2009-10-01
# 6 4D23 2009-10-03
# 7 4D23 2009-10-03
# 8 1X23 2009-10-03

I used aggregate(dam ~ bdate, data=mydf, FUN=length), but that counts every row for a given date, so a dam that appears more than once is counted repeatedly:

       bdate dam
1 2009-10-01   5
2 2009-10-03   3

Instead, I need to have something like this:

mydf2
       bdate dam
1 2009-10-01   2
2 2009-10-03   2

Your help is very much appreciated!

baz asked May 05 '11 02:05



2 Answers

With dplyr you can use n_distinct():

library(dplyr)  # dplyr alone provides %>%, group_by, summarize and n_distinct
mydf %>%
  group_by(bdate) %>%
  summarize(dam = n_distinct(dam))
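
The same count can also be reached by dropping duplicate (bdate, dam) pairs first and then counting rows per date. The following is a minimal sketch, assuming dplyr is installed; it rebuilds the question's data frame so it runs on its own, and the column name n_dams is chosen here purely for illustration.

```r
library(dplyr)

# Rebuild the example data from the question
dam <- c("2A11","2A11","2A12","2A12","2A12","4D23","4D23","1X23")
bdate <- c("2009-10-01","2009-10-01","2009-10-01","2009-10-01",
           "2009-10-01","2009-10-03","2009-10-03","2009-10-03")
mydf <- data.frame(dam, bdate)

# Keep one row per (bdate, dam) pair, then count the rows per date.
# "n_dams" is an illustrative name for the count column.
res <- mydf %>%
  distinct(bdate, dam) %>%
  count(bdate, name = "n_dams")

res  # one row per birth date; n_dams is 2 for both dates
```

This should give the same numbers as the n_distinct version; which one reads better is mostly a matter of taste.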
Preston answered Oct 18 '22 18:10



You could also run unique on the data first:

aggregate(dam ~ bdate, data=unique(mydf[c("dam","bdate")]), FUN=length)

You could also use table instead of aggregate, though the output is a table of counts named by date rather than a data frame.

> table(unique(mydf[c("dam","bdate")])$bdate)

2009-10-01 2009-10-03 
         2          2 
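
A further base R option is to keep the original aggregate call from the question and only change the summary function, so that it counts distinct values instead of rows. The sketch below assumes the same mydf as in the question and rebuilds it so that it runs on its own; the anonymous function is an illustration, not something taken from the answers above.

```r
# Rebuild the example data from the question
dam <- c("2A11","2A11","2A12","2A12","2A12","4D23","4D23","1X23")
bdate <- c("2009-10-01","2009-10-01","2009-10-01","2009-10-01",
           "2009-10-01","2009-10-03","2009-10-03","2009-10-03")
mydf <- data.frame(dam, bdate)

# Within each bdate group, count the distinct dams rather than all rows
res <- aggregate(dam ~ bdate, data = mydf,
                 FUN = function(x) length(unique(x)))
res  # a data frame with one row per date; dam is 2 for both dates

# The same idea with tapply, which returns a vector named by date
counts <- tapply(mydf$dam, mydf$bdate, function(x) length(unique(x)))
counts
```

Because only FUN changes, this variant leaves the input data untouched and keeps the data frame output that the question asked for.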
Aaron left Stack Overflow answered Oct 18 '22 20:10
