
Selecting columns specified by a random vector in R

I have a large matrix from which I would like to randomly extract a smaller matrix. (I want to do this 1000 times, so ultimately it will be in a for loop.) Say for example that I have this 9x9 matrix:

mat=matrix(c(0,0,1,0,1,0,0,0,1,0,0,0,0,1,1,1,0,0,1,0,1,0,0,0,0,0,1,0,1,0,0,0,1,
          0,0,0,0,1,1,1,0,0,1,0,1,0,0,0,0,0,1,0,1,0,0,0,1,0,0,0,0,1,1,1,0,0,
          1,0,1,0,0,0,0,0,1,0,1,0,0,0,1), nrow=9)

From this matrix, I would like a random 3x3 subset. The trick is that I do not want any of the row or column sums in the final matrix to be 0. Another important thing is that I need to know the original number of the rows and columns in the final matrix. So, if I end up randomly selecting rows 4, 5, and 7 and columns 1, 3, and 8, I want to have those identifiers easily accessible in the final matrix.

Here is what I've done so far.

First, I create a vector of row numbers and column numbers. I am trying to keep these attached to the matrix throughout.

r.num<-seq(from=1,to=nrow(mat),by=1)      #vector of row numbers
c.num<-seq(from=0, to=(ncol(mat)+1),by=1) #vector of col numbers (adj for r.num)

mat.1<-cbind(r.num,mat)
mat.2<-rbind(c.num,mat.1)

Now I have a 10x10 matrix with identifiers. I can select my rows by creating a random vector and subsetting the matrix.

rand <- sample(r.num,3)
temp1 <- rbind(mat.2[1,],mat.2[rand,])      #keep the identifier row

This works well! Now I want to randomly select 3 columns. This is where I am running into trouble. I tried doing it the same way.

rand2 <- sample(c.num,3)
temp2 <- cbind(temp1[,1],temp1[,rand2])

The problem is that I end up with some row and column sums that are 0. I can eliminate columns that sum to 0 first.

temp3 <- temp1[,which(colSums(temp1[2:nrow(temp1),])>0)]
cols <- which(colSums(temp1[2:nrow(temp1),2:ncol(temp1)])>0)
rand3 <- sample(cols,3)
temp4 <- cbind(temp3[,1],temp3[,rand3])

But I end up with an error message. For some reason, R does not like to subset the matrix this way.
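
The error message is not quoted, so this is only a guess at the cause: `cols` holds positions counted within the data columns of `temp1`, while `temp3` is a different matrix from which the zero-sum columns have already been dropped, so a position in `cols` can exceed `ncol(temp3)`. A minimal sketch with a made-up 3x4 matrix `m` (hypothetical, not the data above) shows the mismatch and one way around it, which is to index the original matrix with the kept positions:

```r
# Hypothetical toy matrix: column 2 sums to 0
m <- matrix(c(1, 0, 1,
              0, 0, 0,
              0, 1, 0,
              1, 1, 0), nrow = 3)

keep <- which(colSums(m) > 0)         # positions in m: 1, 3, 4
reduced <- m[, keep, drop = FALSE]    # only 3 columns remain

# Position 4 is valid in m but out of bounds in the reduced matrix
res <- try(reduced[, 4], silent = TRUE)
failed <- inherits(res, "try-error")

# Indexing the original matrix with the kept positions avoids the mismatch
picked <- m[, keep[c(1, 3)], drop = FALSE]
```

Sampling from `keep` and indexing `m` directly means the positions always refer to the matrix they were computed from.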

So my question is: is there a better way to subset the matrix by the random vector "rand3" after the zero columns have been removed, or is there a better way to randomly select three complementary rows and columns such that none of them sums to 0?

Thank you so much for your help!

Laura asked Oct 09 '22 08:10


1 Answer

If I understood your problem, I think this would work:

mat=matrix(c(0,0,1,0,1,0,0,0,1,0,0,0,0,1,1,1,0,0,1,0,1,0,0,0,0,0,1,0,1,0,0,0,1,
          0,0,0,0,1,1,1,0,0,1,0,1,0,0,0,0,0,1,0,1,0,0,0,1,0,0,0,0,1,1,1,0,0,
          1,0,1,0,0,0,0,0,1,0,1,0,0,0,1), nrow=9)

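# start from an all-zero matrix so the loop body runs at least once
smallmatrix = matrix(0, nrow=3, ncol=3)

# resample until no row or column of the 3x3 subset sums to 0
while(any(colSums(smallmatrix) == 0) || any(rowSums(smallmatrix) == 0)){
      cols = sample(ncol(mat),3)
      rows = sample(nrow(mat),3)
      smallmatrix = mat[rows,cols]
}

# keep the original row and column numbers as dimnames
colnames(smallmatrix) = cols
rownames(smallmatrix) = rows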
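
Since the question mentions repeating the draw 1000 times, the same rejection-sampling idea could be wrapped in a function. This is only a sketch: `draw.subset`, its `n` and `max.tries` arguments, and the small `example.mat` are names made up for illustration, and `max.tries` is an assumed safety limit so the loop cannot run forever on a matrix with no valid subset.

```r
# Hypothetical helper: draw one n x n subset with no zero row or column sums.
# max.tries is an assumed safety limit, not part of the answer above.
draw.subset <- function(mat, n = 3, max.tries = 10000) {
  for (i in seq_len(max.tries)) {
    rows <- sample(nrow(mat), n)
    cols <- sample(ncol(mat), n)
    sub <- mat[rows, cols, drop = FALSE]
    if (all(rowSums(sub) != 0) && all(colSums(sub) != 0)) {
      # store the original row and column numbers as dimnames
      dimnames(sub) <- list(rows, cols)
      return(sub)
    }
  }
  stop("no valid subset found within max.tries draws")
}

# Small example matrix so the block runs on its own (not the 9x9 data above)
set.seed(1)
example.mat <- diag(4)
example.mat[1, 2] <- 1

# 1000 independent draws, kept as a list of 3x3 matrices
subsets <- replicate(1000, draw.subset(example.mat), simplify = FALSE)
```

The original indices of any draw can then be read back with `as.integer(rownames(subsets[[1]]))` and `as.integer(colnames(subsets[[1]]))`.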

aatrujillob answered Oct 18 '22 03:10