I'm working in R with the following dataset for a metabolomics study.
first Name Area Sample Similarity
120 Pentanone 699468 PO4:1 954
120 Pentanone 153744 PO2:1 981
126 Methylamine 83528 PO4:1 887
126 Unknown 32741 PO2:1 645
126 Sulfurous 43634 PO1:1 800
I want to select, among the rows that share the same value in the first column (for example 120), the compounds with the same name (for example Pentanone). From that selection I want to copy the information from the row with the highest similarity into new columns in the table. In this case that is the following row:
120 Pentanone 153744 PO2:1 981
I know that "send me the code" posts are not very appreciated, but I would greatly appreciate some clues on how to start.
You can use the plyr package.
First I reproduce your data (next time, try to include the output of dput(dat)):
dat <- read.table(text ='first Name Area Sample Similarity
120 Pentanone 699468 PO4:1 954
120 Pentanone 153744 PO2:1 981
126 Methylamine 83528 PO4:1 887
126 Unknown 32741 PO2:1 645
126 Sulfurous 43634 PO1:1 800',header=TRUE)
Then, for each combination of first and Name, I keep the row with the highest Similarity, which gives a new data.frame:
library(plyr)
ddply(dat,.(first,Name),function(x) x[x$Similarity==max(x$Similarity),])
first Name Area Sample Similarity
1 120 Pentanone 153744 PO2:1 981
2 126 Methylamine 83528 PO4:1 887
3 126 Sulfurous 43634 PO1:1 800
4 126 Unknown 32741 PO2:1 645
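If you would rather not depend on plyr, the same selection can be sketched in base R. This is only one possible approach, not the method used in the answer above: it relies on ave() to compute the per-group maximum, and the group key, the best_ column prefix and the merge() step are illustrative assumptions about what "new columns within the table" should look like.

```r
# Same sample data as in the answer above
dat <- read.table(text = 'first Name Area Sample Similarity
120 Pentanone 699468 PO4:1 954
120 Pentanone 153744 PO2:1 981
126 Methylamine 83528 PO4:1 887
126 Unknown 32741 PO2:1 645
126 Sulfurous 43634 PO1:1 800', header = TRUE)

# One key per (first, Name) combination, so ave() only sees groups that exist
key <- paste(dat$first, dat$Name)

# For every row, the highest Similarity found within its group
group_max <- ave(dat$Similarity, key, FUN = max)

# Keep the rows that reach their group's maximum (tied rows are all kept)
best <- dat[dat$Similarity == group_max, ]

# Assumed layout: copy the best row's values into new columns.
# The "best_" prefix is a hypothetical naming choice.
best_cols <- best
names(best_cols)[3:5] <- paste0("best_", names(best_cols)[3:5])
wide <- merge(dat, best_cols, by = c("first", "Name"))
```

Here best holds the same four rows as the ddply result, and wide keeps all five original rows with best_Area, best_Sample and best_Similarity added. Note that if two rows in a group tie for the highest Similarity, merge() will repeat the original rows once per tied row.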