R: Count unique values by category

Question

I have data in R that looks like this:

  Cnty   Yr   Plt       Spp  DBH Ht Age
1  185 1999 20001 Bitternut  8.0 54  47
2  185 1999 20001 Bitternut  7.2 55  50
3   31 1999 20001    Pignut  7.4 71  60
4   31 1999 20001    Pignut 11.4 85 114
5  189 1999 20001        WO 14.5 80  82
6  189 1999 20001        WO 12.1 72  79

I would like to know the number of unique species (Spp) in each county (Cnty). "unique(dfname$Spp)" lists the unique species across the whole data frame, but I would like the count by county.

Any help is appreciated! Sorry for the weird formatting, this is my first ever question on SO.

Thanks.

A5C1D2H2I1M1N2O1R2T1 · Accepted Answer

I've tried to make your sample data a little bit more interesting. Your sample data presently has just one unique "Spp" per "Cnty".

set.seed(1)
mydf <- data.frame(
  Cnty = rep(c("185", "31", "189"), times = c(5, 3, 2)),
  Yr = c(rep(c("1999", "2000"), times = c(3, 2)), 
         "1999", "1999", "2000", "2000", "2000"),
  Plt = "20001",
  Spp = sample(c("Bitternut", "Pignut", "WO"), 10, replace = TRUE),
  DBH = runif(10, 0, 15)
)
mydf
#    Cnty   Yr   Plt       Spp       DBH
# 1   185 1999 20001 Bitternut  3.089619
# 2   185 1999 20001    Pignut  2.648351
# 3   185 1999 20001    Pignut 10.305343
# 4   185 2000 20001        WO  5.761556
# 5   185 2000 20001 Bitternut 11.547621
# 6    31 1999 20001        WO  7.465489
# 7    31 1999 20001        WO 10.764278
# 8    31 2000 20001    Pignut 14.878591
# 9   189 2000 20001    Pignut  5.700528
# 10  189 2000 20001 Bitternut 11.661678

Next, as suggested, tapply is a good candidate here. Combine unique and length to get the data you are looking for.

with(mydf, tapply(Spp, Cnty, FUN = function(x) length(unique(x))))
# 185 189  31 
#   3   2   2 
with(mydf, tapply(Spp, list(Cnty, Yr), FUN = function(x) length(unique(x))))
#     1999 2000
# 185    2    2
# 189   NA    2
# 31     1    1
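
As a rough sketch, the same per-county count can be reached by other base R routes: aggregate() returns it as a data frame, and split() with sapply() returns it as a named vector. The data frame is retyped here from the printed output above rather than regenerated, because sample() after set.seed(1) can return different rows on R 3.6.0 or later. The names by_cnty and by_cnty_vec are illustrative only.

```r
# Data retyped from the printed mydf above (Plt and DBH left out; not needed for counting).
mydf <- data.frame(
  Cnty = rep(c("185", "31", "189"), times = c(5, 3, 2)),
  Yr   = c("1999", "1999", "1999", "2000", "2000",
           "1999", "1999", "2000", "2000", "2000"),
  Spp  = c("Bitternut", "Pignut", "Pignut", "WO", "Bitternut",
           "WO", "WO", "Pignut", "Pignut", "Bitternut"),
  stringsAsFactors = FALSE
)

# Helper: number of distinct values in a vector.
n_distinct_values <- function(x) length(unique(x))

# Unique species per county as a data frame (columns Cnty and Spp).
by_cnty <- aggregate(Spp ~ Cnty, data = mydf, FUN = n_distinct_values)

# Same count as a named vector, one element per county.
by_cnty_vec <- sapply(split(mydf$Spp, mydf$Cnty), n_distinct_values)

print(by_cnty)
print(by_cnty_vec)
# Counts: county 185 has 3 species, 189 has 2, 31 has 2 (same as tapply above).
```

The aggregate() form is convenient when the result needs to be merged back into another data frame; the sapply() form is closer to what tapply returns.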

If you're interested in simple tabulation (not of unique values), then you can explore table and ftable:

with(mydf, table(Spp, Cnty))
#            Cnty
# Spp         185 189 31
#   Bitternut   2   1  0
#   Pignut      2   1  1
#   WO          1   0  2
ftable(mydf, row.vars="Spp", col.vars=c("Cnty", "Yr"))
#           Cnty  185       189        31     
#           Yr   1999 2000 1999 2000 1999 2000
# Spp                                         
# Bitternut         1    1    0    1    0    0
# Pignut            2    0    0    1    0    1
# WO                0    1    0    0    2    0
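
If the contingency table is already at hand, one possible shortcut to the unique counts is to check which cells are non-zero and sum down each county column. This is only a sketch; it assumes the same mydf as above (retyped so the block runs on its own), and tab and n_unique are made-up names.

```r
# Data retyped from the printed mydf above (only the two columns used here).
mydf <- data.frame(
  Cnty = rep(c("185", "31", "189"), times = c(5, 3, 2)),
  Spp  = c("Bitternut", "Pignut", "Pignut", "WO", "Bitternut",
           "WO", "WO", "Pignut", "Pignut", "Bitternut"),
  stringsAsFactors = FALSE
)

# Species-by-county table of row counts.
tab <- with(mydf, table(Spp, Cnty))

# A species is present in a county when its cell is greater than zero;
# summing those TRUE/FALSE flags per column gives unique species per county.
n_unique <- colSums(tab > 0)

print(n_unique)
# Counts: county 185 has 3 species, 189 has 2, 31 has 2.
```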

Arhopala · Answer

As Justin mentioned, aggregate is probably what you want. If you call your data frame foo, the following gives the number of individuals per species in each county, assuming that each row with Bitternut represents a unique individual of the bitternut species. Note that I used foo$Age to calculate the length of the vector, i.e. the number of individuals (rows) belonging to each species, but you could equally use foo$Ht or foo$DBH.

aggregate(foo$Age, by = foo[c('Spp','Cnty')], length)
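
The call above counts rows for each species and county pair. For the number of distinct species per county that the question asks about, aggregate can be given a unique-and-length function instead. A minimal sketch, assuming foo holds the six sample rows from the question; per_pair and n_spp are names chosen for illustration:

```r
# Sample rows from the question, typed in as a data frame named foo.
foo <- data.frame(
  Cnty = c(185, 185, 31, 31, 189, 189),
  Yr   = 1999,
  Plt  = 20001,
  Spp  = c("Bitternut", "Bitternut", "Pignut", "Pignut", "WO", "WO"),
  DBH  = c(8.0, 7.2, 7.4, 11.4, 14.5, 12.1),
  Ht   = c(54, 55, 71, 85, 80, 72),
  Age  = c(47, 50, 60, 114, 82, 79),
  stringsAsFactors = FALSE
)

# Rows (individuals) per species and county; the count lands in column x.
per_pair <- aggregate(foo$Age, by = foo[c("Spp", "Cnty")], length)

# Distinct species per county; the count lands in column Spp.
n_spp <- aggregate(Spp ~ Cnty, data = foo,
                   FUN = function(x) length(unique(x)))

print(per_pair)
print(n_spp)
# With this sample, every species-county pair has 2 rows
# and every county has exactly 1 species.
```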

Cheers,

Danny

Tags: r, unique, count, categories

Klaus Louis
