
Apply sum to a data.frame grouped by substring, in R

Tags:

r

Sample data:

> mat1 = as.data.frame(matrix(c("D-J10-N1","D-J10-N2","D-J2-N1","D-J2-N2",3,6,5,7,8,4,2,3,4,1,2,3), ncol = 4));
> mat1
        V1 V2 V3 V4
1 D-J10-N1  3  8  4
2 D-J10-N2  6  4  1
3  D-J2-N1  5  2  2
4  D-J2-N2  7  3  3

Desired output:

> results
   V1 V2 V3 V4
1 J10  9 12  5
2  J2 12  5  5

So I need to sum V2 through V4, grouped by a substring of V1, and return that substring in my results. I can define my groups as:

> groups <- substr(mat1[,1],1,5)
> groups
[1] "D-J10" "D-J10" "D-J2-" "D-J2-"

I thought of using rowsum, as in:

> rowsum(mat1,groups, reorder = TRUE)

But rowsum seems to accept only numeric values for groups? I've looked at the apply family of functions but had no luck. Any ideas on how to solve this?

Thanks a lot for helping!

Chargaff asked Dec 19 '25 18:12


1 Answer

It helps to set up the data.frame so that the column classes fit the data better (currently they are all factors, or all character strings in R 4.0.0 and later, because the matrix mixes text and numbers).

mat1 <- data.frame(V1=c("D-J10-N1","D-J10-N2","D-J2-N1","D-J2-N2"),V2=c(3,6,5,7),V3=c(8,4,2,3),V4=c(4,1,2,3))
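If the data frame already exists in the form built in the question, the numeric columns can also be converted in place rather than retyped. A minimal sketch, assuming V2 to V4 hold factor or character values that all parse as numbers:

```r
# Rebuild the question's data frame: every column starts as factor or character
mat1 <- as.data.frame(matrix(c("D-J10-N1", "D-J10-N2", "D-J2-N1", "D-J2-N2",
                               3, 6, 5, 7, 8, 4, 2, 3, 4, 1, 2, 3), ncol = 4))

# Go through as.character() first so that factor level codes
# are not mistaken for the values themselves
mat1[-1] <- lapply(mat1[-1], function(x) as.numeric(as.character(x)))

# Inspect the resulting column classes
sapply(mat1, class)
```

The as.character() step is harmless when the columns are already character, so the same line should work regardless of the stringsAsFactors default in use.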

Then you can use aggregate and sub to pick out your substring:

aggregate(mat1[-1],by=list(sub("D-(J[0-9]+)-[A-Z0-9]+","\\1",mat1$V1)),sum)
  Group.1 V2 V3 V4
1     J10  9 12  5
2      J2 12  5  5
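
The rowsum call from the question can be made to work as well. rowsum does accept character groups; what it rejects is non-numeric data, so the V1 column has to be left out of the first argument. A sketch, assuming mat1 is defined with numeric V2 to V4 as in this answer (the names groups, sums and results are only illustrative):

```r
mat1 <- data.frame(V1 = c("D-J10-N1", "D-J10-N2", "D-J2-N1", "D-J2-N2"),
                   V2 = c(3, 6, 5, 7), V3 = c(8, 4, 2, 3), V4 = c(4, 1, 2, 3))

# Extract the middle token (J10, J2) to use as the grouping key
groups <- sub("D-(J[0-9]+)-[A-Z0-9]+", "\\1", mat1$V1)

# Sum only the numeric columns; the group names come back as row names
sums <- rowsum(mat1[-1], groups)

# Move the row names into a V1 column to match the desired layout
results <- data.frame(V1 = rownames(sums), sums, row.names = NULL)
results
```

Unlike substr(mat1[,1],1,5), the regular expression does not depend on the key having a fixed width, so "J2" comes out without a trailing hyphen.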

James answered Dec 22 '25 08:12


