R: applying function over matrix and keeping matrix dimensions

Tags:

r

So I want to apply a function over a matrix in R. This works really intuitively for simple functions:

    > (function(x)x*x)(matrix(1:10, nrow=2))
         [,1] [,2] [,3] [,4] [,5]
    [1,]    1    9   25   49   81
    [2,]    4   16   36   64  100

...but clearly I don't understand all of its workings:

    > m = (matrix(1:10, nrow=2))
    > (function(x) if (x %% 3 == 0) { return(NA) } else { return(x+1) })(m)
         [,1] [,2] [,3] [,4] [,5]
    [1,]    2    4    6    8   10
    [2,]    3    5    7    9   11
    Warning message:
    In if (x == 3) { :
      the condition has length > 1 and only the first element will be used

I read up on this and found out about Vectorize and sapply, which both seemed great and just like what I wanted, except that both of them flatten my matrix into a plain vector:

    > y = (function(x) if (x %% 3 == 0) { return(NA) } else { return(x+1) })
    > sapply(m, y)
     [1]  2  3 NA  5  6 NA  8  9 NA 11
    > Vectorize(y)(m)
     [1]  2  3 NA  5  6 NA  8  9 NA 11

...whereas I'd like to keep it in a matrix with its current dimensions. How might I do this? Thanks!

asked Dec 20 '11 17:12

Paul Eastlund

2 Answers

@Joshua Ulrich (and Dason) have a great answer, and doing it directly without the function y is the best solution. But if you really need to call a function, you can make it faster using vapply. Like sapply, it produces a vector without dimensions (only faster), but you can add them back using structure:

    # Your function (optimized)
    y = function(x) if (x %% 3) x+1 else NA

    m <- matrix(1:1e6,1e3)
    system.time( r1 <- apply(m,1:2,y) ) # 4.89 secs
    system.time( r2 <- structure(sapply(m, y), dim=dim(m)) ) # 2.89 secs
    system.time( r3 <- structure(vapply(m, y, numeric(1)), dim=dim(m)) ) # 1.66 secs
    identical(r1, r2) # TRUE
    identical(r1, r3) # TRUE

As you can see, the vapply approach is about 3x faster than apply. The reason vapply is faster than sapply is that sapply must analyse the result to figure out that it can be simplified to a numeric vector. With vapply, you specify the result type (numeric(1)), so it doesn't have to guess.
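To make the "doesn't have to guess" point concrete, here is a minimal sketch. The function bad is hypothetical and exists only to produce a mixed result. It shows that sapply quietly switches the result type, whereas vapply checks every value against the declared template and stops on a mismatch:

```r
y <- function(x) if (x %% 3) x + 1 else NA
m <- matrix(1:10, nrow = 2)

# vapply checks each result against the template numeric(1);
# the logical NA is promoted to double, which is allowed
r <- vapply(m, y, numeric(1))
typeof(r)   # "double"

# Hypothetical function that returns a string for multiples of 3
bad <- function(x) if (x %% 3) x + 1 else "multiple of 3"

# sapply inspects the results and silently coerces them all to character
s <- sapply(m, bad)
typeof(s)   # "character"

# vapply refuses the character result instead of guessing
res <- tryCatch(vapply(m, bad, numeric(1)), error = function(e) "error")
res         # "error"
```

So besides the speed gain, the declared type also acts as a safety check on what the function returns.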

UPDATE: I figured out another (shorter) way of preserving the matrix structure:

    m <- matrix(1:10, nrow=2)
    m[] <- vapply(m, y, numeric(1))

You simply assign the new values to the object using m[] <-. All other attributes (such as dim, dimnames and class) are then preserved.
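A small sketch of the difference between the two approaches, assuming a matrix with dimnames (the row and column names are made up for illustration). Assigning through m[] keeps them, while structure(..., dim = dim(m)) only restores the shape:

```r
y <- function(x) if (x %% 3) x + 1 else NA

# Example matrix with illustrative row and column names
m <- matrix(1:10, nrow = 2,
            dimnames = list(c("a", "b"), paste0("c", 1:5)))

# Replace the values in a copy; dim and dimnames stay in place
m2 <- m
m2[] <- vapply(m, y, numeric(1))
dimnames(m2)

# structure() restores the dimensions but not the dimnames
m3 <- structure(vapply(m, y, numeric(1)), dim = dim(m))
is.null(dimnames(m3))   # TRUE
```

Note that m2 ends up as a double matrix even though m holds integers, because y returns doubles.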

answered Sep 29 '22 21:09

Tommy

One way is to use apply on both rows and columns:

    apply(m,1:2,y)
         [,1] [,2] [,3] [,4] [,5]
    [1,]    2   NA    6    8   NA
    [2,]    3    5   NA    9   11
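With both margins given, apply hands the function one cell at a time, which is why the scalar if inside y works here. A minimal sketch that counts the calls (the wrapper counted and the counter calls are names introduced just for this illustration):

```r
y <- function(x) if (x %% 3 == 0) NA else x + 1
m <- matrix(1:10, nrow = 2)

calls <- 0
counted <- function(x) {
  calls <<- calls + 1   # count how often the function is invoked
  y(x)
}

r <- apply(m, 1:2, counted)
calls    # 10, one call per cell
dim(r)   # 2 5
```

One function call per element is also why this gets slow on large matrices.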

You can also do it with subscripting, because %% and == are already vectorized:

    m[m %% 3 == 0] <- NA
    m <- m+1
    m
         [,1] [,2] [,3] [,4] [,5]
    [1,]    2   NA    6    8   NA
    [2,]    3    5   NA    9   11
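If the original m should stay untouched, the same vectorized idea can be written as one expression with base R's ifelse. That function is not used in the answer above, so treat this as an alternative sketch. ifelse returns a result shaped like its test argument, so the dimensions carry over:

```r
m <- matrix(1:10, nrow = 2)

# NA where the value is a multiple of 3, otherwise value + 1
r <- ifelse(m %% 3 == 0, NA, m + 1)

dim(r)   # 2 5
r
```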

answered Sep 29 '22 22:09

Joshua Ulrich
