Divide each cell of a large matrix by the sum of its row

I have a site by species matrix. The dimensions are 375 x 360. Each value represents the frequency of a species in samples of that site.

I am trying to convert this matrix from frequencies to relative abundances at each site.

I've tried a few ways to achieve this and the only one that has worked is using a for loop. However, this takes an incredibly long time or simply never finishes.

Is there a function or a vectorised method of achieving this? I've included my for-loop as an example of what I am trying to do.

relative_abundance <- matrix(0, nrow = nrow(data), ncol = ncol(data),
                             dimnames = dimnames(data))

for (i in 1:nrow(relative_abundance)) {
  for (j in 1:ncol(relative_abundance)) {
    species_freq <- data[i, j]
    row_sum <- sum(data[i, ])  # recomputed for every cell of row i
    relative_abundance[i, j] <- species_freq / row_sum
  }
}
Zane.Lazare asked Feb 29 '16 02:02

1 Answer

You could do this using apply, but scale makes things even simpler in this case. Assuming you want to divide columns by their sums:

set.seed(0)
relative_abundance <- matrix(sample(1:10, 360*375, TRUE), nrow= 375)

freqs <- scale(relative_abundance, center = FALSE, 
               scale = colSums(relative_abundance))

The matrix is too big to output here, but here's what it should look like:

> head(freqs[, 1:5])
            [,1]         [,2]        [,3]        [,4]         [,5]
[1,] 0.004409603 0.0014231499 0.003439803 0.004052685 0.0024026910
[2,] 0.001469868 0.0023719165 0.002457002 0.005065856 0.0004805382
[3,] 0.001959824 0.0018975332 0.004914005 0.001519757 0.0043248438
[4,] 0.002939735 0.0042694497 0.002948403 0.002532928 0.0009610764
[5,] 0.004899559 0.0009487666 0.000982801 0.001519757 0.0028832292
[6,] 0.001469868 0.0023719165 0.002457002 0.002026342 0.0009610764

And a sanity check:

> head(colSums(freqs))
[1] 1 1 1 1 1 1
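
The question asks for proportions within each row (site), whereas the example above divides columns. As a minimal sketch of how the same scale call could be adapted, the matrix can be transposed before and after scaling. The small matrix m below is made up for illustration and is not the asker's data; note that a row summing to zero would produce NaN values.

```r
# Hypothetical 3-site x 4-species frequency matrix (illustration only).
m <- matrix(c(1, 2, 3, 4,
              0, 5, 5, 0,
              2, 2, 2, 2),
            nrow = 3, byrow = TRUE)

# scale() operates on columns, so transpose first, divide each
# (former) row by its total, then transpose back.
rel_rows <- t(scale(t(m), center = FALSE, scale = rowSums(m)))

# Sanity check: every row should now sum to 1.
rowSums(rel_rows)
```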

Using apply:

freqs2 <- apply(relative_abundance, 2, function(i) i/sum(i))

This has the advantage of being easily changed to run by rows (use 1 instead of 2 as the second argument), but apply binds each row's result as a column of the output, so you'd have to transpose it.
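
A rough sketch of the row-wise variant, using a small made-up matrix m rather than the asker's data. It also compares the result with two base R alternatives that avoid apply: dividing by rowSums, and prop.table with margin = 1.

```r
# Hypothetical 3-site x 4-species frequency matrix (illustration only).
m <- matrix(c(1, 2, 3, 4,
              0, 5, 5, 0,
              2, 2, 2, 2),
            nrow = 3, byrow = TRUE)

# apply() over rows returns each row's result as a column, hence the t().
by_apply <- t(apply(m, 1, function(r) r / sum(r)))

# Dividing by rowSums() recycles the row totals down each column,
# so element [i, j] is divided by the total of row i.
by_rowsums <- m / rowSums(m)

# prop.table() with margin = 1 also yields row proportions.
by_prop <- prop.table(m, margin = 1)

# All three approaches should agree.
all.equal(by_apply, by_rowsums)
all.equal(by_rowsums, by_prop)
```

The rowSums division is fully vectorised, so it is likely the simplest option for a 375 x 360 matrix.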

Molx answered Oct 23 '22 15:10