I would like to know the number of unique dams which gave birth on each of the birth dates recorded. My data frame is similar to this one:
dam <- c("2A11","2A11","2A12","2A12","2A12","4D23","4D23","1X23")
bdate <- c("2009-10-01","2009-10-01","2009-10-01","2009-10-01",
"2009-10-01","2009-10-03","2009-10-03","2009-10-03")
mydf <- data.frame(dam,bdate)
mydf
# dam bdate
# 1 2A11 2009-10-01
# 2 2A11 2009-10-01
# 3 2A12 2009-10-01
# 4 2A12 2009-10-01
# 5 2A12 2009-10-01
# 6 4D23 2009-10-03
# 7 4D23 2009-10-03
# 8 1X23 2009-10-03
I used aggregate(dam ~ bdate, data=mydf, FUN=length)
but it counts every row for a given date, so a dam with several offspring is counted more than once:
bdate dam
1 2009-10-01 5
2 2009-10-03 3
Instead, I need to have something like this:
mydf2
bdate dam
1 2009-10-01 2
2 2009-10-03 2
Your help is very much appreciated!
In dplyr you can use n_distinct:
library(tidyverse)
mydf %>%
group_by(bdate) %>%
summarize(dam = n_distinct(dam))
You could also run unique on the data first, so that each dam/bdate pair appears only once before counting:
aggregate(dam ~ bdate, data=unique(mydf[c("dam","bdate")]), FUN=length)
Then you could also use table instead of aggregate, though the output is a named table rather than a data frame:
> table(unique(mydf[c("dam","bdate")])$bdate)
2009-10-01 2009-10-03
2 2
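Another base R option, sketched here as an alternative to the approaches above rather than something from the original answer, is to keep aggregate but swap FUN=length for a small anonymous function that counts distinct values. The example rebuilds the data from the question so it runs on its own; the name mydf2 is borrowed from the desired output in the question.

```r
# Rebuild the example data from the question
dam <- c("2A11","2A11","2A12","2A12","2A12","4D23","4D23","1X23")
bdate <- c("2009-10-01","2009-10-01","2009-10-01","2009-10-01",
           "2009-10-01","2009-10-03","2009-10-03","2009-10-03")
mydf <- data.frame(dam, bdate)

# Count the distinct dams within each birth date
mydf2 <- aggregate(dam ~ bdate, data = mydf,
                   FUN = function(x) length(unique(x)))
mydf2
#        bdate dam
# 1 2009-10-01   2
# 2 2009-10-03   2
```

This avoids de-duplicating the whole data frame first, which may be convenient if mydf has other columns you want to leave untouched.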