
R: for-loop with ddply

Tags:

for-loop

r

plyr

I'm new to R and to Stack Overflow, so I'm sorry if the question or its format isn't ideal...

I'm trying to get some basic statistics from a matrix by using ddply, and I wanted to speed the process up by using a for loop over the gene columns. Unfortunately this wasn't as easy as I had thought...

Strain  gene1         gene2      gene3  .   .   .
 A    2.6336700     1.42802     0.935742
 A    2.0634700     2.31232     1.096320
 A    2.5798600     2.75138     0.714647
 B    2.6031200     1.31374     1.214920
 B    2.8319400     1.30260     1.191770
 B    1.9796000     1.74199     1.056490
 C    2.4030300     1.20324     1.069800
 .
 .
 .
----------

for (n in c("gene1","gene2","gene3","gene4")) {
  summary <- ddply(Data, .(Strain), summarise,
                mean = mean(n),
                sd   = sd(n),
                se   = sd(n) / sqrt(length(n)) )
}

The result shows mean = 6, and both sd and se are "NA", which is obviously not what I had in mind.

If I get rid of the for loop and manually insert the column name (gene1):

summary <- ddply(Data, .(Strain), summarise,
              mean = mean(gene1),
              sd   = sd(gene1),
              se   = sd(gene1) / sqrt(length(gene1)) )

Now it seems to give me the correct result. Can someone enlighten me on this matter and tell me what I'm doing wrong?

user2764233 asked Oct 03 '22 00:10

1 Answer

I know you didn't ask for it, but here is a solution with aggregate in base R.

# One line in base.
aggregate(Data[paste0('gene',1:3)],by=Data['Strain'],
     function(x) c(mean=mean(x),sd=sd(x),se=sd(x)/sqrt(length(x))))
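
If you would rather keep the loop from the question, here is a minimal sketch in base R. Inside the loop, n holds a character string such as "gene1", so mean(n) is applied to that string rather than to the column it names; the loop also overwrites summary on every pass, so only the last gene would survive. The sketch looks the column up with Data[[n]] and stores one result per gene. The toy Data frame (built from the first six sample rows), the stats helper and the results list are illustrative names, not part of the original code.

```r
# Toy data mirroring the layout in the question (first six sample rows only).
Data <- data.frame(
  Strain = c("A", "A", "A", "B", "B", "B"),
  gene1  = c(2.63367, 2.06347, 2.57986, 2.60312, 2.83194, 1.97960),
  gene2  = c(1.42802, 2.31232, 2.75138, 1.31374, 1.30260, 1.74199),
  gene3  = c(0.935742, 1.096320, 0.714647, 1.214920, 1.191770, 1.056490)
)

# Hypothetical helper: mean, standard deviation and standard error of a vector.
stats <- function(x) c(mean = mean(x), sd = sd(x), se = sd(x) / sqrt(length(x)))

# Loop over column names; Data[[n]] fetches the column that the string n names.
results <- list()
for (n in c("gene1", "gene2", "gene3")) {
  by_strain <- split(Data[[n]], Data$Strain)    # one numeric vector per strain
  results[[n]] <- t(sapply(by_strain, stats))   # rows = strains, columns = mean/sd/se
}

results$gene1   # summary matrix for gene1
```

Each element of results is a small matrix with one row per strain, so results$gene2["B", "sd"] would pull out a single value.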
nograpes answered Oct 07 '22 19:10