Pad each element in a list to specific length in R

Tags: r, lapply, na

Here is a simple R question that, I think, comes down to understanding list syntax correctly. I have a series of vectors stored in a list (following some preliminary calculations) on which I want to do some basic block averaging. My workflow is as follows:

1) Round the length of each vector in the list up to the nearest multiple of the block size I want to average over.

2) Pad each vector in the list with NAs to this new length.

3) Convert each padded vector to a matrix, to which I will then apply colMeans, ignoring NAs.

This very basic workflow follows the simple approach shown here for a vector: http://www.cookbook-r.com/Manipulating_data/Averaging_a_sequence_in_blocks/

However, I have a list of vectors, not just a single vector. For example, for blocks of two:

test1 <- list(a=c(1,2,3,4), b=c(2,4,6,8,10), c=c(3,6))
# Round the length of each vector up to the nearest multiple of 2
newlength <- lapply(test1, function(x) {ceiling(length(x)/2)*2})

Now to my problem. If this were a single vector outside a list, I would normally pad it with NAs as follows:

test1[newlength] <- NA

But how do I do this using lapply (or something similar, such as mapply)? I am obviously not thinking about the syntax correctly here:

lapply(test1, function(x) {x[newlength] <- NA})

This obviously returns the error:

Error in x[newlength] <- NA : invalid subscript type 'list'

since the syntax for a list is incorrect. So how should I do this correctly?
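A minimal sketch of why that call fails, using the same `test1`: `lapply()` always returns a list, so `newlength` holds one target length per element, and the whole list is passed as the subscript for every `x`. The names `err` and `newlength_vec` are only illustrative and do not appear in the original post.

```r
test1 <- list(a = c(1, 2, 3, 4), b = c(2, 4, 6, 8, 10), c = c(3, 6))
newlength <- lapply(test1, function(x) ceiling(length(x) / 2) * 2)

# newlength is a list with one target length per element, not a single number
str(newlength)

# Indexing a numeric vector with a list is what raises the error;
# tryCatch() captures the message instead of stopping the script
err <- tryCatch(
  lapply(test1, function(x) { x[newlength] <- NA }),
  error = function(e) conditionMessage(e)
)
print(err)

# vapply() returns a plain named numeric vector of target lengths instead
newlength_vec <- vapply(test1, function(x) ceiling(length(x) / 2) * 2, numeric(1))
print(newlength_vec)
```

Even with a plain vector of lengths, each vector still has to be paired with its own target length, which is why a function that walks two inputs in parallel (such as `mapply()`) is a natural fit here.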

Just to finish the process, in case there is an entirely better way of doing this: at the end, I would normally do the following to a vector:

# Convert to a matrix with 2 rows
test1 <- matrix(test1, nrow=2)
# Take the means of the columns, and ignore any NA's
colMeans(test1, na.rm=TRUE)
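For reference, the complete sequence for one vector might look like the sketch below. The names `v`, `padded`, `m`, and `means` are illustrative, and the padding uses `length<-` rather than an indexed assignment, which is an assumption about the preferred idiom and not part of the original workflow.

```r
v <- c(2, 4, 6, 8, 10)

# Pad with NA up to the nearest multiple of 2
padded <- v
length(padded) <- ceiling(length(v) / 2) * 2

# Each column of the 2-row matrix is one block of two values
m <- matrix(padded, nrow = 2)

# Average each block, ignoring the NA padding
means <- colMeans(m, na.rm = TRUE)
print(means)
```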

Would I be better off leaving the list structure first? My reason for using a list is that I have a large dataset, and a list seemed the more elegant approach. I am open to suggestions and more logical approaches, however. Thanks.

user1912925 asked Oct 26 '25 09:10


2 Answers

It sounds like you want:

mapply(function(x, y) {
  # x[y] <- NA    # OP's proposed strategy
  length(x) <- y  # Roland's better suggestion
  return(x)
}, test1, newlength, SIMPLIFY = FALSE)  # SIMPLIFY = FALSE always returns a list
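A short sketch of why setting `length(x)` is the safer of the two strategies: when a vector's length is already a multiple of the block size, `x[y] <- NA` overwrites its last value, whereas `length(x) <- y` leaves it untouched. The names `a`, `b`, and `padded` are illustrative only, and `Map()` is shown as an equivalent that always returns a list.

```r
x <- c(1, 2, 3, 4)  # length is already a multiple of 2
y <- 4

a <- x
a[y] <- NA          # overwrites the 4th value with NA

b <- x
length(b) <- y      # no change, because the length is already 4

print(a)
print(b)

# The same padding applied across the whole list with Map()
test1 <- list(a = c(1, 2, 3, 4), b = c(2, 4, 6, 8, 10), c = c(3, 6))
newlength <- lapply(test1, function(x) ceiling(length(x) / 2) * 2)
padded <- Map(function(x, y) { length(x) <- y; x }, test1, newlength)
str(padded)
```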
Thomas answered Oct 28 '25 00:10


There are lots of ways to fix your problem, but I think there are two important improvements to make. The first is to do all of this in a single call to lapply(). The second concerns the function() in the call that raises the error (sorry, on a tablet, difficult to copy and paste): even with a valid subscript, its last expression is the assignment, so it returns the assigned NA rather than the padded vector. You pad out "x" OK, but you never tell function() to return it.

Here is one solution that does both these things, if I understand you correctly:

lapply(test1, function(x){
  newlength <- ceiling(length(x)/2)*2
  if(newlength!=length(x)){x[newlength] <- NA}
  colMeans(matrix(x, nrow=2), na.rm=TRUE)
})
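The same idea can be sketched with the block size as a parameter. `block_means` and its `block` argument are hypothetical names introduced here for illustration, and the padding uses `length<-` so no length check is needed.

```r
# Average a numeric vector in consecutive blocks of `block` values.
# A final incomplete block is averaged over the values it does have.
block_means <- function(x, block = 2) {
  target <- ceiling(length(x) / block) * block
  length(x) <- target                      # pads with NA if needed
  colMeans(matrix(x, nrow = block), na.rm = TRUE)
}

test1 <- list(a = c(1, 2, 3, 4), b = c(2, 4, 6, 8, 10), c = c(3, 6))

# Extra arguments to lapply() are passed on to block_means()
result2 <- lapply(test1, block_means, block = 2)
result3 <- lapply(test1, block_means, block = 3)
str(result2)
```

Keeping the padding, reshaping, and averaging inside one function means the list never has to be unpacked, and the block size only has to be changed in one place.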
Peter Ellis answered Oct 28 '25 00:10


