Can I aggregate a dataframe and retain string variables in R?

I have a data frame of the form:

  Family Code Length Type
1      A    1     11 Alpha
2      A    3      8 Beta
3      A    3      9 Beta
4      B    4      7 Alpha
5      B    5      8 Alpha
6      C    6      2 Beta
7      C    6      5 Beta
8      C    6      4 Beta

I would like to reduce the data set to one containing unique values of Code by taking a mean of Length values, but to retain all string variables too, i.e.

  Family Code Length Type
1      A    1     11 Alpha
2      A    3    8.5 Beta
3      B    4      7 Alpha
5      B    5      8 Alpha
6      C    6   3.67 Beta

I've tried aggregate() and ddply() but these seem to replace strings with NA and I'm struggling to find a way round this.

R_usr asked Oct 24 '11 21:10


1 Answer

Since Family and Type are constant within each Code group, you can group on them as well without changing the result when you use ddply() from the plyr package. If your original data set is dat, then

library(plyr)
ddply(dat, .(Family, Code, Type), summarize, Length=mean(Length))

gives

  Family Code  Type    Length
1      A    1 Alpha 11.000000
2      A    3  Beta  8.500000
3      B    4 Alpha  7.000000
4      B    5 Alpha  8.000000
5      C    6  Beta  3.666667
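
Since the question mentions aggregate(), here is a rough base R sketch of the same idea, grouping on all three columns so the string variables are carried through. The data frame is rebuilt from the question's printout, and the name res is only an illustration; treat this as one possible approach under those assumptions rather than the definitive one.

```r
# Rebuild the example data from the question (values copied from its printout).
dat <- data.frame(
  Family = c("A", "A", "A", "B", "B", "C", "C", "C"),
  Code   = c(1, 3, 3, 4, 5, 6, 6, 6),
  Length = c(11, 8, 9, 7, 8, 2, 5, 4),
  Type   = c("Alpha", "Beta", "Beta", "Alpha", "Alpha", "Beta", "Beta", "Beta")
)

# Group on Family, Code and Type together; only Length is averaged.
# This relies on Family and Type being constant within each Code.
res <- aggregate(Length ~ Family + Code + Type, data = dat, FUN = mean)

# aggregate() does not order the rows by Code, so sort explicitly.
res <- res[order(res$Code), ]
print(res)
```

Putting the string columns on the right-hand side of the formula makes them grouping variables, so aggregate() never tries to take their mean.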

If Family and Type are not constant within a Code group, then you need to define how to summarize/aggregate those values. In this example, I just take the single unique value, which assumes each Code group has exactly one distinct Family and one distinct Type:

ddply(dat, .(Code), summarize, Family=unique(Family), 
  Length=mean(Length), Type=unique(Type))
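
If a group really does contain several distinct strings, one possible rule is to paste the distinct values together. The sketch below uses base R only; the data frame dat2, the helper collapse_unique, and the "/" separator are all hypothetical choices made for illustration and are not part of the original question.

```r
# Hypothetical data: Code 3 has two different Type values.
dat2 <- data.frame(
  Family = c("A", "A", "A", "B"),
  Code   = c(1, 3, 3, 4),
  Length = c(11, 8, 9, 7),
  Type   = c("Alpha", "Beta", "Gamma", "Alpha"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: join the distinct values of a group into one string.
collapse_unique <- function(x) paste(unique(x), collapse = "/")

# Split by Code, reduce each piece to a single row, then stack the rows.
pieces <- lapply(split(dat2, dat2$Code), function(g) {
  data.frame(
    Code   = g$Code[1],
    Family = collapse_unique(g$Family),
    Length = mean(g$Length),
    Type   = collapse_unique(g$Type),
    stringsAsFactors = FALSE
  )
})
res2 <- do.call(rbind, pieces)
print(res2)
```

Other rules, such as keeping the first value or the most frequent one, would slot into the same place as collapse_unique.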

Update

Similar options using dplyr are

library(dplyr)
dat %>%
  group_by(Family, Code, Type) %>%
  summarise(Length=mean(Length))

and

dat %>%
  group_by(Code) %>%
  summarise(Family=unique(Family), Length=mean(Length), Type=unique(Type))

Brian Diggs answered Sep 27 '22 22:09