How to divide each row of a matrix by elements of a vector in R

Tags: r, vector, matrix

I would like to divide each row of a matrix by a fixed vector. For example

    mat<-matrix(1,ncol=2,nrow=2,TRUE)
    dev<-c(5,10)

Evaluating mat/dev divides each column by dev:

         [,1] [,2]
    [1,]  0.2  0.2
    [2,]  0.1  0.1

However, I would like to get the following result instead, i.e. do the operation row-wise:

    rbind(mat[1,]/dev, mat[2,]/dev)

         [,1] [,2]
    [1,]  0.2  0.1
    [2,]  0.2  0.1

Is there an explicit command to get there?


asked Dec 15 '13 15:12

tomka

1 Answer

Here are a few ways in order of increasing code length:

    t(t(mat) / dev)
    mat / dev[col(mat)]  # @DavidArenburg & @akrun
    mat %*% diag(1 / dev)
    sweep(mat, 2, dev, "/")
    t(apply(mat, 1, "/", dev))
    plyr::aaply(mat, 1, "/", dev)
    mat / rep(dev, each = nrow(mat))
    mat / t(replace(t(mat), TRUE, dev))
    mapply("/", as.data.frame(mat), dev)  # added later
    mat / matrix(dev, nrow(mat), ncol(mat), byrow = TRUE)  # added later
    do.call(rbind, lapply(as.data.frame(t(mat)), "/", dev))
    mat2 <- mat; for(i in seq_len(nrow(mat2))) mat2[i, ] <- mat2[i, ] / dev
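As a quick sanity check, the sketch below runs a few of the base R approaches listed above on the 2 x 2 example from the question and compares each against the row-by-row result. The names expected, results and agree are introduced here only for illustration.

```r
# Small example from the question: a 2 x 2 matrix of ones and a divisor vector.
mat <- matrix(1, nrow = 2, ncol = 2)
dev <- c(5, 10)

# Desired result: every row divided element-wise by dev.
expected <- rbind(mat[1, ] / dev, mat[2, ] / dev)

# A few of the approaches from the list (base R only).
results <- list(
  transpose = t(t(mat) / dev),
  col_index = mat / dev[col(mat)],
  diag_mult = mat %*% diag(1 / dev),
  sweep     = sweep(mat, 2, dev, "/"),
  rep_each  = mat / rep(dev, each = nrow(mat))
)

# TRUE for every approach whose result matches the expected matrix.
agree <- vapply(results, function(r) isTRUE(all.equal(r, expected)), logical(1))
print(agree)
```

all.equal is used rather than identical because the diag approach multiplies by reciprocals, which in general need not be bit-for-bit equal to a true division.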

Data Frames

All the solutions that begin with mat / also work if mat is a data frame and produce a data frame result. The same is true of the sweep solution and of the last solution, i.e. the mat2 loop. The mapply solution works with data frames but produces a matrix.
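The difference in return type can be seen in a minimal sketch, assuming a small data frame df built from the question's example (df, res_sweep and res_mapply are names made up for this illustration):

```r
# A data frame version of the example matrix.
df  <- as.data.frame(matrix(1, nrow = 2, ncol = 2))
dev <- c(5, 10)

# sweep() keeps the data frame class.
res_sweep <- sweep(df, 2, dev, "/")
print(class(res_sweep))

# mapply() pairs each column with one element of dev and simplifies
# the per-column results into a matrix.
res_mapply <- mapply("/", df, dev)
print(class(res_mapply))

# If a data frame is wanted, the mapply result can be converted back.
res_mapply_df <- as.data.frame(res_mapply)
```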

Vector

If mat is a plain vector rather than a matrix, then either of these returns a one-column matrix:

    t(t(mat) / dev)
    mat / t(replace(t(mat), TRUE, dev))

and this one returns a vector:

plyr::aaply(mat, 1, "/", dev)

The others give an error, a warning, or a result other than the desired one.
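A minimal sketch of the vector case, under the assumption that the vector is meant as a single column and dev is therefore a single divisor (the values of v and dev are invented for illustration):

```r
# A plain vector (no dim attribute) and a single divisor.
v   <- c(10, 20, 30)
dev <- 5

# t(v) is a 1 x 3 matrix, so transposing back yields a 3 x 1 matrix.
res1 <- t(t(v) / dev)

# replace() fills the 1 x 3 matrix with dev; the division takes its
# dimensions from the transposed 3 x 1 matrix.
res2 <- v / t(replace(t(v), TRUE, dev))

print(dim(res1))
print(dim(res2))
```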

Benchmarks

The brevity and clarity of the code may be more important than speed but for purposes of completeness here are some benchmarks using 10 repetitions and then 100 repetitions.

    library(microbenchmark)
    library(plyr)

    set.seed(84789)

    mat<-matrix(runif(1e6),nrow=1e5)
    dev<-runif(10)

    microbenchmark(times=10L,
      "1" = t(t(mat) / dev),
      "2" = mat %*% diag(1/dev),
      "3" = sweep(mat, 2, dev, "/"),
      "4" = t(apply(mat, 1, "/", dev)),
      "5" = mat / rep(dev, each = nrow(mat)),
      "6" = mat / t(replace(t(mat), TRUE, dev)),
      "7" = aaply(mat, 1, "/", dev),
      "8" = do.call(rbind, lapply(as.data.frame(t(mat)), "/", dev)),
      "9" = {mat2 <- mat; for(i in seq_len(nrow(mat2))) mat2[i, ] <- mat2[i, ] / dev},
     "10" = mat/dev[col(mat)])

giving:

    Unit: milliseconds
     expr         min          lq       mean      median          uq        max neval
        1    7.957253    8.136799   44.13317    8.370418    8.597972  366.24246    10
        2    4.678240    4.693771   10.11320    4.708153    4.720309   58.79537    10
        3   15.594488   15.691104   16.38740   15.843637   16.559956   19.98246    10
        4   96.616547  104.743737  124.94650  117.272493  134.852009  177.96882    10
        5   17.631848   17.654821   18.98646   18.295586   20.120382   21.30338    10
        6   19.097557   19.365944   27.78814   20.126037   43.322090   48.76881    10
        7 8279.428898 8496.131747 8631.02530 8644.798642 8741.748155 9194.66980    10
        8  509.528218  524.251103  570.81573  545.627522  568.929481  821.17562    10
        9  161.240680  177.282664  188.30452  186.235811  193.250346  242.45495    10
       10    7.713448    7.815545   11.86550    7.965811    8.807754   45.87518    10

Re-running the test on all those that took <20 milliseconds with 100 repetitions:

    microbenchmark(times=100L,
      "1" = t(t(mat) / dev),
      "2" = mat %*% diag(1/dev),
      "3" = sweep(mat, 2, dev, "/"),
      "5" = mat / rep(dev, each = nrow(mat)),
      "6" = mat / t(replace(t(mat), TRUE, dev)),
     "10" = mat/dev[col(mat)])

giving:

    Unit: milliseconds
     expr       min        lq      mean    median        uq       max neval
        1  8.010749  8.188459 13.972445  8.560578 10.197650 299.80328   100
        2  4.672902  4.734321  5.802965  4.769501  4.985402  20.89999   100
        3 15.224121 15.428518 18.707554 15.836116 17.064866  42.54882   100
        5 17.625347 17.678850 21.464804 17.847698 18.209404 303.27342   100
        6 19.158946 19.361413 22.907115 19.772479 21.142961  38.77585   100
       10  7.754911  7.939305  9.971388  8.010871  8.324860  25.65829   100

So on both these tests #2 (using diag) is fastest. The reason may lie in its almost direct appeal to the BLAS, whereas #1 relies on the costlier t.
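One thing worth keeping in mind when choosing #2 for speed: it multiplies by 1/dev instead of dividing by dev, so its output may differ from the other approaches in the last few bits. The sketch below, on a small random matrix with an arbitrary seed (both are assumptions made for illustration, not part of the benchmark above), compares #1 and #2 within a numeric tolerance.

```r
set.seed(1)                          # arbitrary seed for this sketch
mat <- matrix(runif(20), nrow = 5)   # small 5 x 4 matrix
dev <- runif(4) + 1                  # divisors kept away from zero

by_division   <- t(t(mat) / dev)        # approach #1: true division
by_reciprocal <- mat %*% diag(1 / dev)  # approach #2: multiply by reciprocals

# all.equal() compares within a tolerance, unlike identical().
same_within_tol <- isTRUE(all.equal(by_division, by_reciprocal))
print(same_within_tol)

# Largest absolute difference between the two results.
max_diff <- max(abs(by_division - by_reciprocal))
print(max_diff)
```

If exact reproducibility against a division-based result matters, one of the division-based approaches such as #1 or #10 may be the safer choice despite being slightly slower in these benchmarks.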


answered Sep 23 '22 04:09

G. Grothendieck
