So I want to apply a function over a matrix in R. This works really intuitively for simple functions:
    > (function(x) x*x)(matrix(1:10, nrow=2))
         [,1] [,2] [,3] [,4] [,5]
    [1,]    1    9   25   49   81
    [2,]    4   16   36   64  100
...but clearly I don't understand all of its workings:
    > m = (matrix(1:10, nrow=2))
    > (function(x) if (x %% 3 == 0) { return(NA) } else { return(x+1) })(m)
         [,1] [,2] [,3] [,4] [,5]
    [1,]    2    4    6    8   10
    [2,]    3    5    7    9   11
    Warning message:
    In if (x%%3 == 0) { :
      the condition has length > 1 and only the first element will be used
I read up on this and found out about Vectorize and sapply, which both seemed great and just like what I wanted, except that both of them flatten my matrix into a plain vector:
    > y = (function(x) if (x %% 3 == 0) { return(NA) } else { return(x+1) })
    > sapply(m, y)
     [1]  2  3 NA  5  6 NA  8  9 NA 11
    > Vectorize(y)(m)
     [1]  2  3 NA  5  6 NA  8  9 NA 11
...whereas I'd like to keep it in a matrix with its current dimensions. How might I do this? Thanks!
One of the most famous and most used features of R is the *apply() family of functions, such as apply(), tapply(), and lapply(). Here, we'll look at apply(), which instructs R to call a user-specified function on each of the rows or each of the columns of a matrix.
We can apply a function to each element of a matrix, or only along specific dimensions (rows or columns), using apply().
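As a minimal sketch of how the second argument of apply() (MARGIN) selects rows, columns, or individual elements; the small matrix `m` and the functions used here are illustrative choices only:

```r
m <- matrix(1:6, nrow = 2)   # 2 x 3 matrix, filled column by column

row_sums <- apply(m, 1, sum)                  # MARGIN = 1: one call per row
col_sums <- apply(m, 2, sum)                  # MARGIN = 2: one call per column
squared  <- apply(m, 1:2, function(x) x * x)  # MARGIN = 1:2: one call per element

row_sums
col_sums
squared   # has the same dimensions as m
```

With MARGIN = 1:2 the function receives a single value at a time, so the result keeps the shape of the input matrix.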
The matrix() function in R creates a matrix from a given set of values. You pass it a vector of elements together with the number of rows (nrow) and/or the number of columns (ncol) you want. Note: by default, the matrix is filled column by column.
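A short sketch of the fill order, assuming the same six values in both cases; the variable names `by_col` and `by_row` are made up for illustration:

```r
by_col <- matrix(1:6, nrow = 2)                # default: byrow = FALSE
by_row <- matrix(1:6, nrow = 2, byrow = TRUE)  # fill row by row instead

by_col   # first row holds 1, 3, 5
by_row   # first row holds 1, 2, 3
```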
@Joshua Ulrich (and Dason) have a great answer, and doing it directly without the function y is the best solution. But if you really need to call a function, you can make it faster using vapply. Like sapply, it produces a vector without dimensions (only faster), but you can then add the dimensions back using structure:
    # Your function (optimized)
    y = function(x) if (x %% 3) x+1 else NA

    m <- matrix(1:1e6,1e3)
    system.time( r1 <- apply(m,1:2,y) ) # 4.89 secs
    system.time( r2 <- structure(sapply(m, y), dim=dim(m)) ) # 2.89 secs
    system.time( r3 <- structure(vapply(m, y, numeric(1)), dim=dim(m)) ) # 1.66 secs

    identical(r1, r2) # TRUE
    identical(r1, r3) # TRUE
As you can see, the vapply approach is about 3x faster than apply. The reason vapply is faster than sapply is that sapply must analyse the result to figure out that it can be simplified to a numeric vector. With vapply, you specify the result type (numeric(1)), so it doesn't have to guess.
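To make the difference concrete, here is a small sketch using the same function y on a 2 x 5 matrix. It is not a benchmark; it only illustrates that vapply is told the result type up front and refuses results that don't match. The `mismatch` variable and the "error" marker string are illustrative:

```r
y <- function(x) if (x %% 3) x + 1 else NA   # same function as above

m <- matrix(1:10, nrow = 2)

# sapply() inspects the results to decide how to simplify them
s <- sapply(m, y)

# vapply() is told up front that each call returns a single numeric value
v <- vapply(m, y, numeric(1))

identical(s, v)

# If the declared type does not match what the function returns,
# vapply() stops with an error instead of guessing.
mismatch <- tryCatch(
  vapply(m, y, character(1)),
  error = function(e) "error"
)
mismatch
```

Note that the logical NA returned for multiples of 3 is accepted, because vapply allows promotion from logical to numeric.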
UPDATE: I figured out another (shorter) way of preserving the matrix structure:
    m <- matrix(1:10, nrow=2)
    m[] <- vapply(m, y, numeric(1))
You simply assign the new values to the object using m[] <-. Then all other attributes are preserved (like dim, dimnames, class, etc.).
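A quick way to check this for yourself is to give the matrix some dimnames first. In the sketch below the row and column labels are hypothetical, added only so there is something to preserve:

```r
y <- function(x) if (x %% 3) x + 1 else NA

m <- matrix(1:10, nrow = 2,
            dimnames = list(c("a", "b"), paste0("c", 1:5)))  # hypothetical labels

m[] <- vapply(m, y, numeric(1))   # replace the values, keep the attributes

dim(m)
dimnames(m)
m
```

One thing to be aware of: the values are replaced, so the storage type changes from integer to double here, but the shape and labels stay as they were.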
One way is to use apply on both rows and columns:
    apply(m,1:2,y)
         [,1] [,2] [,3] [,4] [,5]
    [1,]    2   NA    6    8   NA
    [2,]    3    5   NA    9   11
You can also do it with subscripting because == is already vectorized:
    m[m %% 3 == 0] <- NA
    m <- m+1
    m
         [,1] [,2] [,3] [,4] [,5]
    [1,]    2   NA    6    8   NA
    [2,]    3    5   NA    9   11
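Another vectorized option, not used in the answers above but commonly reached for in this situation, is ifelse(). A minimal sketch, assuming the same 2 x 5 matrix as in the question (the name `r` is arbitrary):

```r
m <- matrix(1:10, nrow = 2)

# ifelse() works element-wise and returns a result shaped like its test,
# so the matrix dimensions carry over.
r <- ifelse(m %% 3 == 0, NA, m + 1)

r
```

Unlike the subscripting version, this leaves m itself unchanged and puts the result in a new object.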