
R: for loop fails when applying the max function

I should say up front that I'm new to R and still learning the fundamentals. I'm currently working on a large data frame (called "ppl") that I need to filter. Each row belongs to a group (grp) and has an intensity value (into) and a sample value (sample).

       mz  rt      into   sample  tracker     sn   grp
 100.0153 126  2.762664      3    11908 7.522655   0
 100.0171 127  2.972048      2    5308  7.718521   0
 100.0788 272 30.217969      2    5309 19.024807   1
 100.0796 272 17.277916      3   11910  7.297716   1
 101.0042 128 37.557324      3   11916 27.991320   2
 101.0043 128 39.676014      2    5316 28.234918   2

My first question is: how can I select, from each group, the sample with the highest intensity? I tried a for loop:

for (i in ppl$grp) {
  temp <- ppl[ppl$grp == i, ]
  sel <- rbind(sel, temp[max(temp$into), ])
}

The problem is that it works for ppl$grp == 0, but the following iterations return rows of NAs. The filtered data frame (called "sel") should also store the sample values of the removed rows. It should look like this:

      mz  rt      into   sample  tracker     sn   grp
100.0171 127  2.972048   c(2,3)    5308  7.718521   0
100.0788 272 30.217969   c(2,3)    5309 19.024807   1
101.0043 128 39.676014   c(2,3)    5316 28.234918   2

To get this, I would use the following approach:

lev<-factor(ppl$grp)
samp<-ppl$sample
samp2<-split(samp,lev)
sel$sample<-samp2

Any hints? I can't test it yet because I still haven't solved the previous problem.

Thanks a lot.

AeonRed asked Oct 19 '22 01:10


1 Answer

Not sure if I follow your question. But maybe this will get you started.

library(dplyr)
# Within each grp, keep the row(s) whose into equals the group maximum
ppl %>% group_by(grp) %>% filter(into == max(into))
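
As for why the loop in the question misbehaves: `max(temp$into)` returns the largest intensity value, not its position, so that value ends up being used as a row number. For group 0 the maximum is about 2.97, which R truncates to row 2 and which happens to be the right row; for group 1 it is about 30.2, and since there is no row 30 you get a row of NAs. The loop also runs once per row of `ppl$grp` rather than once per group. A minimal base R sketch of a fix, assuming the six rows shown in the question and starting `sel` from NULL (the question does not show how `sel` was initialised):

```r
# Sample data copied from the six rows shown in the question
ppl <- data.frame(
  mz      = c(100.0153, 100.0171, 100.0788, 100.0796, 101.0042, 101.0043),
  rt      = c(126, 127, 272, 272, 128, 128),
  into    = c(2.762664, 2.972048, 30.217969, 17.277916, 37.557324, 39.676014),
  sample  = c(3, 2, 2, 3, 3, 2),
  tracker = c(11908, 5308, 5309, 11910, 11916, 5316),
  sn      = c(7.522655, 7.718521, 19.024807, 7.297716, 27.991320, 28.234918),
  grp     = c(0, 0, 1, 1, 2, 2)
)

sel <- NULL                      # assumption: start empty; rbind(NULL, x) returns x
for (i in unique(ppl$grp)) {     # visit each group once, not once per row
  temp <- ppl[ppl$grp == i, ]
  # which.max() gives the position of the largest into, not the value itself
  sel <- rbind(sel, temp[which.max(temp$into), ])
}
sel
```

Note that `which.max()` keeps only the first row when there is a tie, whereas the `filter(into == max(into))` version above keeps all tied rows.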
user51855 answered Oct 21 '22 00:10
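
For the second part of the question (keeping every group's sample values next to the selected row), the `split()` idea from the question should work in principle, as long as the list is matched to `sel` by group rather than by position. The sketch below is one possible way to do it, assuming the six rows from the question; `samples_by_grp` is a name introduced here for illustration, and the sample values are stored in a list column.

```r
# Sample data copied from the six rows shown in the question
ppl <- data.frame(
  mz      = c(100.0153, 100.0171, 100.0788, 100.0796, 101.0042, 101.0043),
  rt      = c(126, 127, 272, 272, 128, 128),
  into    = c(2.762664, 2.972048, 30.217969, 17.277916, 37.557324, 39.676014),
  sample  = c(3, 2, 2, 3, 3, 2),
  tracker = c(11908, 5308, 5309, 11910, 11916, 5316),
  sn      = c(7.522655, 7.718521, 19.024807, 7.297716, 27.991320, 28.234918),
  grp     = c(0, 0, 1, 1, 2, 2)
)

# Row with the highest into in each group: split by grp, pick one row, re-bind
sel <- do.call(
  rbind,
  lapply(split(ppl, ppl$grp), function(g) g[which.max(g$into), ])
)

# All sample values per group, sorted; list names are the grp values as strings
samples_by_grp <- lapply(split(ppl$sample, ppl$grp), sort)

# Look the samples up by group name, then store them as a list column
sel$sample <- unname(samples_by_grp[as.character(sel$grp)])
sel
```

Each element of `sel$sample` is then a numeric vector, for example `sel$sample[[1]]` holds the samples that belonged to group 0.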