
How do I obtain a summary for each unique id?

Tags: r, unique, plyr

I would like to extract summary statistics for the values in several columns, grouped by id. My data looks as follows:

id                pace       type                   value      abundance 
51                (T)        (JC)                   (L)           0        
51                (T)        (JC)                   (L)           0 
51                (T)        (JC)                   (H)           0
52                (T)        (JC)                   (H)           0
52                (R)        (JC)                   (H)           0
53                (T)        (JC)                   (L)           1
53                (T)        (JC)                   (H)           1
53                (R)        (JC)                   (H)           1
53                (R)        (JC)                   (H)           1
53                (R)        (JC)                   (H)           1
54                (T)        (BC)                 <blank>         0
54                (T)        (BC)                 <blank>         0
54                (T)        (BC)                 <blank>         0

and I am hoping for something like this:

id    ptype       (T)    (R)        (L)      (H)     abundance
51     (JC)        3      0          2        1         0
52     (JC)        1      1          0        2         0
53     (JC)        2      3          1        4         1
54     (BC)        3      0          0        0         0

I have begun writing some code:

for (i in levels(df$id))
{
  extract.event <- df[df$id == i, ]           # rows belonging to this id
  ppace <- table(extract.event$pace)          # count table of pace
  ptype <- extract.event$type[1]              # first row gives the type
  nvalues <- table(extract.event$value)       # count table of value
  nabundance <- min(extract.event$abundance)  # minimum of abundance

  # forbeh is not defined anywhere in the snippet shown
  d <- cbind(ppace, ptype, forbeh, nvalues, nabundance)
}

but I am running into problems merging the values, especially when nabundance prints out an empty table. I would prefer not to extract by name, as there are so many names in the data frame. Any ideas? I thought the plyr package might help, but I'm still not sure...

Thanks,

Grace

Grace Sutton asked Oct 17 '22 13:10



1 Answer

I had to rewrite your data.frame (for future reference, please paste the output of dput so that nobody has to retype your data), but here is my attempt. I'm guessing you are looking for something along the lines of the aggregate function:

df <- data.frame(id = as.factor(c(51,51,51,52,52,53,53,53,53,53,54,54,54)),
      pace = c("(T)","(T)","(T)","(T)","(R)","(T)","(T)","(R)","(R)","(R)","(T)","(T)","(T)"),
      type = c("(JC)","(JC)","(JC)","(JC)","(JC)","(JC)","(JC)","(JC)","(JC)","(JC)","(BC)","(BC)","(BC)"),
      value = c("(L)","(L)","(H)","(H)","(H)","(L)","(H)","(H)","(H)","(H)","<blank>","<blank>","<blank>"),
      abundance = c(0,0,0,0,0,1,1,1,1,1,0,0,0))

smallnames <- colnames(do.call("cbind",as.list(aggregate(cbind(value, pace, abundance) ~ id + type, data = lapply(df, as.character), table))))
smallnames
[1] "id"      "type"    "(H)"     "(L)"     "<blank>" "(R)"     "(T)"     "0"      
[9] "1"
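As a side note on why the do.call step is used: when the function passed to aggregate returns more than one value per group (as table does), the counts end up as a matrix stored inside a single data frame column. The small sketch below uses a made-up toy data frame (the names toy, g, x, res and flat are hypothetical and not part of your data) to show that shape and how rebuilding the data frame flattens it.

```r
# Hypothetical toy data, for illustration only.
toy <- data.frame(g = c("a", "a", "b"),
                  x = factor(c("u", "v", "u")))

# table() returns one count per factor level, so each group yields two numbers.
res <- aggregate(x ~ g, data = toy, FUN = table)

ncol(res)        # 2: the counts sit together in one matrix column, res$x
colnames(res$x)  # the factor levels "u" and "v"

# Rebuilding the data frame from its list of columns splits the matrix
# into ordinary columns (named x.u and x.v).
flat <- do.call(data.frame, as.list(res))
names(flat)
```

The flattened names carry the original column as a prefix, which is presumably why the column names are collected separately with cbind above.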

df.new <- do.call("data.frame", as.list(aggregate(cbind(value, pace, abundance) ~ id + type, data = lapply(df, as.character), table)))
colnames(df.new) <- smallnames
# Number of rows per id with abundance == 1 (a count, not the minimum)
df.new$abundance <- df.new$`1`
df.new
  id type (H) (L) <blank> (R) (T) 0 1 abundance
1 54 (BC)   0   0       3   0   3 3 0         0
2 51 (JC)   1   2       0   0   3 3 0         0
3 52 (JC)   2   0       0   1   1 2 0         0
4 53 (JC)   4   1       0   3   2 0 5         5

# Logical indexing avoids the empty result that -which() gives when nothing matches
df.final <- df.new[, !(colnames(df.new) %in% c("<blank>", "0", "1"))]
df.final
  id type (H) (L) (R) (T) abundance
1 54 (BC)   0   0   0   3         0
2 51 (JC)   1   2   0   3         0
3 52 (JC)   2   0   1   1         0
4 53 (JC)   4   1   3   2         5

Let me know if this is what you are looking for or if you have trouble with it.
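Note that the abundance column above counts the rows with abundance 1, whereas the desired output in the question shows the minimum abundance per id. If the minimum is what you need, one possible alternative is to cross-tabulate each column against id with table and compute the remaining columns per group. This is only a sketch under assumptions: the blank values are entered as NA so that table drops them, and the names by_id, pace_counts, value_counts and summary_df are made up here.

```r
# Example data with the same values as above; blanks are assumed to be NA.
df <- data.frame(
  id        = c(51, 51, 51, 52, 52, 53, 53, 53, 53, 53, 54, 54, 54),
  pace      = c("(T)", "(T)", "(T)", "(T)", "(R)", "(T)", "(T)",
                "(R)", "(R)", "(R)", "(T)", "(T)", "(T)"),
  type      = c(rep("(JC)", 10), rep("(BC)", 3)),
  value     = c("(L)", "(L)", "(H)", "(H)", "(H)", "(L)", "(H)",
                "(H)", "(H)", "(H)", NA, NA, NA),
  abundance = c(0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0),
  stringsAsFactors = FALSE
)

# Two-way count tables: one row per id, one column per observed level.
pace_counts  <- table(df$id, df$pace)
value_counts <- table(df$id, df$value)  # NA values are dropped by default

# Per-id pieces that are not counts.
by_id <- split(df, df$id)

summary_df <- data.frame(
  id   = as.integer(names(by_id)),
  type = vapply(by_id, function(g) g$type[1], character(1)),
  as.data.frame.matrix(pace_counts),
  as.data.frame.matrix(value_counts),
  abundance = vapply(by_id, function(g) min(g$abundance), numeric(1)),
  check.names = FALSE,  # keep column names such as "(T)" unchanged
  row.names = NULL
)
summary_df
```

Because no level names are typed in, this should also cope with columns that have many distinct values, although the column order follows the sorted levels rather than the order in your example.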

Evan Friedland answered Oct 22 '22 04:10
