Using lists for simulation

Question

I set myself a little challenge on my way to learning R. The question was, given a sample of 500 numbers in normal distribution with mean 20, how many numbers under 20 would I get for standard deviations from 6 to 10. Just to have to learn more I decided to get 4 samples for each sd. So by the end I should have:

sd6samp1:...

sd6samp2:...

....

sd10samp4:...

My first approach, which worked was:

 ddss<-c(6:10) # sd's
 sam<-c(1:4) # 4 samples for each
 k=0  # counter in 0
 for (i in ddss) {   # for each sd
   for (j in sam) {  # for each sample
     nam <- paste("sam",i,".",j, sep="") # building a name
     n <- assign(nam,rnorm(500, 20, i))  # the great assign function
     k <- k+sum(n<=0)
   }
   print(assign(paste("ds",i,sep=""), k)) # ohh assign you're great
   k=0 # reset counter
 }

While looking for how to create variable names with the looping 'i', founded that 'assign' does the work but it also said:

Note though that if you are planning some simulations, many guRus would say that you should use a list.

So I thoght it would be good to learn lists...

In the meanwhile I also discover a great other option... ddss <- c(6:10)

for (i in ddss) {
   print(paste('prob. x<=0), with sd=',i))
   print(pnorm(0,mean=20,sd=i)*500)
}

This worked to answer the question, but the lists were still to be done... and a lot of R has yet to be learned. The main idea wasn't to know the very prob or number of negatives... but to learn R and specifically some looping.

So, I've been trying to go with the mentioned lists

My closest approach has been:

ddss<-c(6:10) # sd's to be calculated.
sam<-c(1:4) # 4 samples for each sd
liss<-list()  # initializing the list
for (i in ddss) {   # for each sd
   liss[[i]] <- list()
   for (j in sam) {  # for each sample
      liss[[i]][[j]] <- rnorm(500, 20, i)
      print(paste('ds',i,'samp',j,'=',sum(liss[[i]][[j]]<0)))
   }
}

With this one I get the information but I'm wondering about two issues (1 & 2) and some other questions (3 & 4):

I get a list of 10 elements, 6 empty ones and then 4 with sublists. I can't seem to find out how to work with elements 1:4 of the list (sd's) with the 6:9 names (the very sd's).
Even though I tried, I couldn't get to name the lists elements through the 'for' loops. Any insight on these issues would be great.
Since in this context of simulations. What do you think is better: nested lists (lists with sublists) or simple (longer) lists?
I wondered whether the 'apply' functions would be of any help here, I tried to do something, like:

vbv<-matrix(c(6,6,6,6,7,7,7,7,8,8,8,8,9,9,9,9))
lsl<-apply(vbv, 2, function(x) rnorm(500,20,x))

But it looks I'm not getting even close....

Nick Sabbe · Accepted Answer

The problem is in your indexes: you are running over indexer i from ddss, which runs from 6 to 10. So in the first tour of duty in your outer loop, your first statement really says: liss[[6]]<-list(), implying that the first 5 ones are NULL.

So if you insist on working with loops, this is what you should do (check ?seq_along):

ddss<-c(6:10) # sd's to be calculated.
sam<-c(1:4) # 4 samples for each sd
liss<-list()  # initializing the list
for (i in seq_along(ddss)) {   # now, i runs from 1 to 5
   liss[[i]] <- list()
   for (j in sam) {  # for each sample
      liss[[i]][[j]] <- rnorm(500, 20, i)
      print(paste('ds',ddss[i],'samp',j,'=',sum(liss[[i]][[j]]<0)))
   }
   names(liss[[i]])<-as.character(sam)#this should solve your naming issue (1/2)
}
names(liss)<-as.character(ddss)#this should solve your naming issue (2/2)

Note that, as always, it is a good idea to name your variables something more useful than i or j: if you'd named it curds, maybe you wouldn't have used it immediately as an indexer in a list?

Now, if you are really aiming for improvement (but want to stick to lists), you indeed want to go with the apply style functions:

liss<-lapply(ddss, function(curds){ #apply the inline function to each ds and store results in a list
  return(lapply(sam, function(cursam){ #apply inline function to each sam and store results in a list
    rv<-rnorm(500, 20, curds)
    cat('ds',curds,'samp',cursam,'=',sum(rv<0), "
") #maybe better for your purposes.
    return(rv)
  }))
})

Finally, for your case, there is not a lot of reason to actually use lists (nor do you even need to keep the sampled data for each ds/sam): you can store everything as a threedimensional array, but since you specify it as a learning exercise (hey, maybe the array thing can be your next exercise :-)), I'll leave it at that.

Using lists for simulation

Tags:

random

r

nested-lists

simulation

fioghual

1 Answers

Nick Sabbe

Recent Activity

Donate For Us

Using lists for simulation

Tags:

random

r

nested-lists

simulation

fioghual

1 Answers

Nick Sabbe

Related questions

Recent Activity

Donate For Us