
How to define multiple variables with lapply?

I want to apply a function of several variables, each taking different values, to a list. I know how to do this with one changing variable:

sapply(c(1:10), function(x) x * 2)
# [1]  2  4  6  8 10 12 14 16 18 20

but not with two. First, here is what I want, done manually (I actually use lapply(), but sapply() gives more compact output for posting here):

# manual
a <- sapply(c(1:10), function(x, y=2) x * y)
b <- sapply(c(1:10), function(x, y=3) x * y)
c <- sapply(c(1:10), function(x, y=4) x * y)
c(a, b, c)
# [1]  2  4  6  8 10 12 14 16 18 20  3  6  9 12 15 18 21 24 27 30  4  8 12 
# [24]  16 20 24 28 32 36 40

And this is my attempt where I try to define both x and y.

# attempt
X <- list(x = 1:10, y = 2:4)
sapply(c(1:10, 2:4), function(x, y) x * y)
# Error in FUN(X[[i]], ...) : argument "y" is missing, with no default

Benchmark of solutions

library(microbenchmark)
microbenchmark(sapply = as.vector(sapply(1:10, function(x, y) x * y, 2:4)), 
               mapply = mapply( FUN = function(x, y) x * y, 1:10, rep( x = 2:4, each = 10)),
               sapply2 = as.vector(sapply(1:10, function(y) sapply(2:4, function(x) x * y))),
               outer = c(outer(1:10, 2:4, function(x, y) x * y)))
# Unit: microseconds
# expr        min       lq      mean   median       uq      max neval
# sapply   34.212  36.3500  62.44864  39.1295  41.9090 2304.542   100
# mapply   62.008  65.8570  87.82891  70.3470  76.5480 1283.342   100
# sapply2 196.714 203.9835 262.09990 223.6550 232.2080 3344.129   100
# outer     7.698  10.4775  13.02223  12.4020  13.4715   53.883   100
jay.sf asked Feb 03 '18 14:02


3 Answers

General solution

Try outer:

c(outer(1:10, 2:4, Vectorize(function(x, y) x*y)))
##  [1]  2  4  6  8 10 12 14 16 18 20  3  6  9 12 15 18 21 24 27 30  4  8 12 16 20
## [26] 24 28 32 36 40
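
A minimal sketch of when Vectorize is actually needed, assuming a hypothetical scalar-only function f that is not part of the question. outer() calls FUN once with two long vectors, so a function built on `if` has to be wrapped first:

```r
# Hypothetical scalar-only function: `if` expects a length-one condition,
# so f cannot be handed to outer() directly.
f <- function(x, y) if (x > y) x - y else x * y

# Vectorize() wraps f in mapply(), so f is called once per (x, y) pair.
res <- outer(1:3, 1:2, Vectorize(f))
res
```

Rows correspond to the values of x and columns to the values of y.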

If the function is already vectorized

If the function is already vectorized, as it is here, then we can omit Vectorize:

c(outer(1:10, 2:4, function(x, y) x * y))
##  [1]  2  4  6  8 10 12 14 16 18 20  3  6  9 12 15 18 21 24 27 30  4  8 12 16 20
## [26] 24 28 32 36 40

Particular example shown in question

In fact, in this particular case the anonymous function is just multiplication, which is the default FUN of outer(), so this would work:

c(outer(1:10, 2:4))
##  [1]  2  4  6  8 10 12 14 16 18 20  3  6  9 12 15 18 21 24 27 30  4  8 12 16 20
## [26] 24 28 32 36 40

Also in this particular case we could use:

c(1:10 %o% 2:4)
##  [1]  2  4  6  8 10 12 14 16 18 20  3  6  9 12 15 18 21 24 27 30  4  8 12 16 20
## [26] 24 28 32 36 40

If input is list X

If your starting point is list X shown in the question then:

c(outer(X[[1]], X[[2]], Vectorize(function(x, y) x * y)))
##  [1]  2  4  6  8 10 12 14 16 18 20  3  6  9 12 15 18 21 24 27 30  4  8 12 16 20
## [26] 24 28 32 36 40

or

c(do.call("outer", c(unname(X), Vectorize(function(x, y) x*y))))
##  [1]  2  4  6  8 10 12 14 16 18 20  3  6  9 12 15 18 21 24 27 30  4  8 12 16 20
## [26] 24 28 32 36 40

As in the earlier sections, Vectorize (or the function argument altogether) can be dropped where applicable.
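
The unname() call matters here. A minimal sketch of why, using the list X from the question: outer() names its arguments X and Y in upper case, so the lower-case names x and y do not match them and the call fails. The tryCatch() wrapper is only there to make the failure visible without stopping the script.

```r
X <- list(x = 1:10, y = 2:4)

# With names kept, x and y fall into `...` and outer()'s own X is missing.
named <- tryCatch(do.call(outer, X), error = function(e) "error")

# With names dropped, the two vectors are matched by position.
unnamed <- do.call(outer, unname(X))

named
dim(unnamed)
```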

G. Grothendieck answered Nov 03 '22 22:11


Use mapply()

mapply() applies a function to multiple list or vector arguments in parallel, element by element.

rep() is used to repeat the values 2, 3, and 4: with each = 10, every element of x is repeated 10 times.

This is necessary because the first argument to mapply(), 1:10, has length 10, so each value of y must be paired with all 10 values of x.

# supply the function first, followed by the
# arguments in the order in which they are called in `FUN`
mapply( FUN = function(x, y) x * y
        , 1:10
        , rep( x = 2:4, each = 10)
)

# [1]   2  4  6  8 10 12 14 16 18 20  3  6  9 12 15 18 21 24 27 30  4  8 12 16 20
# [26] 24 28 32 36 40
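
As an alternative sketch (not part of the original answer), expand.grid() can build the same pairs without working out the rep() arguments by hand. It varies its first column fastest, which matches the pairing used above. The name grid is just illustrative.

```r
# One row per (x, y) combination; x varies fastest.
grid <- expand.grid(x = 1:10, y = 2:4)

# Walk the two columns in parallel.
res <- mapply(function(x, y) x * y, grid$x, grid$y)

# Same values, in the same order, as the rep() version.
ref <- mapply(function(x, y) x * y, 1:10, rep(2:4, each = 10))
all(res == ref)
```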
Cristian E. Nuno answered Nov 03 '22 23:11


First of all, you can do this with lapply() alone if your function is vectorized. In this case, it is:

x <- 1:10
unlist(lapply(2:4, function(y) x*y))
# OR, passing x explicitly through lapply()'s `...` argument
unlist(lapply(2:4, function(y, x) x * y, x = x))

Second, if you need to apply a function to every combination of two vectors, use outer():

xf <- 1:10
yf <- 2:4
c(xf %o% yf)
# OR spelled out for any function:
c(outer(xf,yf,FUN = `*`))

If you use mapply(), the MoreArgs argument saves you from having to construct the arguments with rep():

xf <- 1:10
yf <- 2:4
mapply(function(x,y) x*y,
       y = yf,
       MoreArgs = list(x = xf))

That is the exact equivalent of the lapply() construct shown above, except that mapply() simplifies its result to a 10 x 3 matrix by default. To get a vector instead, use SIMPLIFY = FALSE and unlist():

unlist(mapply(function(x,y) x*y,
              y = yf,
              MoreArgs = list(x = xf),
              SIMPLIFY = FALSE))
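
A short check of the shapes involved, as a sketch that reuses the xf and yf values from above (the names m and v are only illustrative):

```r
xf <- 1:10
yf <- 2:4

# Default SIMPLIFY = TRUE: one column per value of y.
m <- mapply(function(x, y) x * y, y = yf, MoreArgs = list(x = xf))
dim(m)

# SIMPLIFY = FALSE returns a list of three vectors; unlist() flattens it.
v <- unlist(mapply(function(x, y) x * y, y = yf,
                   MoreArgs = list(x = xf), SIMPLIFY = FALSE))
length(v)

# Both hold the same values in the same (column-major) order.
all(as.vector(m) == v)
```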

Which solution is the most convenient depends on your actual use case. Timing-wise they are all comparable; in recent R versions outer() will probably be a tad slower than the other solutions.

Benchmarking

To show how results can vastly differ depending on the size and order of the objects, I include the following benchmarking results (code and output below). They show that:

  1. outer() is not necessarily the fastest solution, although it is often one of the fastest.
  2. manually repeating one vector in mapply() adds so much overhead that even a double sapply() call is way faster.

The code (warning: this will run for a while):

fx <- sample(1e4)
fy <- sample(1e3)
library(microbenchmark)
microbenchmark(sapply = as.vector(sapply(fx, function(x, y) x * y, fy)), 
               mapply = mapply( FUN = function(x, y) x * y, fx, rep( fy, each = 1e4)),
               sapply2 = as.vector(sapply(fx, function(y) sapply(fy, function(x) x * y))),
               outer = c(outer(fx, fy, function(x, y) x * y)),
               mapply2 = mapply(function(x,y) x*y, x=fx, MoreArgs = list(y = fy)),
               mapply3 = mapply(function(x,y) x*y, y=fy, MoreArgs = list(x = fx)),
               times = 15)

The output on my machine:

Unit: milliseconds
    expr         min          lq       mean      median          uq        max neval cld
  sapply    89.52318    92.98653   344.1538   117.11280   239.64887  1485.3178    15 a  
  mapply 20471.02137 22925.42757 24478.5985 24650.29055 25627.31232 28840.3494    15   c
 sapply2  7472.02251  8268.04696  9519.8016  8707.19193  9528.46181 14182.7537    15  b 
   outer    77.62331    85.94651   189.5107    91.83722   182.08506  1119.6620    15 a  
 mapply2    77.76871    79.71924   143.9484    81.24168    84.53247   971.1792    15 a  
 mapply3    65.21709    71.85662   107.9586    73.80779   124.21141   242.0760    15 a  
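
The benchmarked expressions do not all return their values in the same layout, which is worth knowing before swapping one for another. A small sketch with tiny inputs (chosen here for illustration, not the benchmark sizes) compares each candidate against outer():

```r
fx <- 1:4
fy <- 1:3
f <- function(x, y) x * y

ref <- outer(fx, fy, f)                              # 4 x 3 matrix
s1  <- sapply(fx, f, fy)                             # 3 x 4: one column per fx
s2  <- sapply(fx, function(y) sapply(fy, function(x) x * y))  # 3 x 4
m1  <- mapply(f, fx, rep(fy, each = length(fx)))     # plain vector, length 12
m2  <- mapply(f, x = fx, MoreArgs = list(y = fy))    # 3 x 4
m3  <- mapply(f, y = fy, MoreArgs = list(x = fx))    # 4 x 3

# The 3 x 4 results are transposes of the outer() result;
# m1 and m3 already match its column-major order.
c(all(t(s1) == ref), all(t(s2) == ref), all(m1 == c(ref)),
  all(t(m2) == ref), all(m3 == ref))
```

If the order of the flattened vector matters, transpose the 3 x 4 results before calling as.vector() on them.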
Joris Meys answered Nov 03 '22 21:11