
Harnessing .f list names with purrr::pmap

Tags: r, purrr

The following works ok:

pmap_dbl(iris, ~ ..1 + ..2 + ..3 + ..4)

The documentation for .l says: "A list of lists. ... List names will be used if present." This suggests you should be able to refer to the list names (i.e. the column names) directly. However:

pmap_dbl(iris, ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width)
Error in .f(Sepal.Length = .l[[c(1L, i)]], Sepal.Width = .l[[c(2L, i)]],  : 
  object 'Sepal.Length' not found

How are list names harnessed in practice?

geotheory asked Jul 01 '18 11:07


2 Answers

The formula argument ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width is passed to purrr::as_mapper.

purrr::as_mapper(~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width)
# function (..., .x = ..1, .y = ..2, . = ..1) 
# Sepal.Length + Sepal.Width + Petal.Length + Petal.Width

You can see that there's no direct way for this function to know what these variables are.
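
To make this concrete, here is a minimal sketch of what happens when such a mapper is called with named arguments, which is how pmap calls it. The names one and two are placeholders chosen for illustration, not anything from purrr itself. The named values are captured by ... and never become variables that the function body can see.

```r
library(purrr)

# Build the same kind of mapper that pmap creates from a formula.
# `one` and `two` are illustrative names only.
mapper <- as_mapper(~ one + two)

# The named arguments are swallowed by `...`; the body then looks up
# `one` as an ordinary variable and fails to find it.
result <- tryCatch(
  mapper(one = 1, two = 2),
  error = function(e) conditionMessage(e)
)
result
```

The error is caught and returned as a string so the snippet runs to completion. It assumes no variable called one exists in the calling environment.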

I can think of 3 ways around this. I'll use @zacdav's example as it's more compact and readable than yours:

named_list <- list(one = c(1, 1),
                   two = c(2, 2),
                   three = c(3, 3))

Explicit definition

You can define these variables explicitly as function arguments, as shown in @zacdav's answer, and it will work.


Explore the dots argument

There is a way to access the named arguments through the ... parameter of the function returned by as_mapper.

As the documentation says in other words, the arguments passed to the function are named whenever names are available.

That explains why pmap(named_list, function(x,y,z) x+y+z) will fail with error:

unused arguments (one = .l[[c(1, i)]], two = .l[[c(2, i)]], three = .l[[c(3, i)]])

See:

pmap(named_list, ~names(list(...)))
# [[1]]
# [1] "one"   "two"   "three"
# 
# [[2]]
# [1] "one"   "two"   "three"

(pmap(unname(named_list), function(x,y,z) x+y+z) on the other hand will work fine)

So this will work:

pmap(named_list, ~ with(list(...), one + two + three))
# [[1]]
# [1] 6
# 
# [[2]]
# [1] 6 
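
Applied to the iris example from the question, the same trick might look like the sketch below. The cross-check against the positional ..1 form is only there to illustrate that both give the same result.

```r
library(purrr)

# Expose the named arguments captured by `...` as variables via with().
by_name <- pmap_dbl(
  iris,
  ~ with(list(...), Sepal.Length + Sepal.Width + Petal.Length + Petal.Width)
)

# The positional version from the question, for comparison.
by_position <- pmap_dbl(iris, ~ ..1 + ..2 + ..3 + ..4)

all.equal(by_name, by_position)
```

Species is also passed in through ..., but since the expression never refers to it, it is simply ignored.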

Use pryr::f

pryr offers a neat shortcut for function definitions with pryr::f :

library(pryr)
f(one + two + three)
# function (one, three, two) 
# one + two + three

pmap(named_list, f(one + two + three))
# [[1]]
# [1] 6
# 
# [[2]]
# [1] 6

Be careful when using it, however: global variables will still show up as parameters, and functions may or may not be included as parameters depending on how they are called. For example:

x <- 1
test <- mean
f(test(x) + lapply(iris,test2))
# function (iris, test2, x) 
# test(x) + lapply(iris, test2)

So it's not a general approach, and you should use it only in simple cases. The second approach, though a bit of a hack, is general.

Moreover, f orders the parameters alphabetically. This should not be an issue with a fully named list, but be careful with partially named lists.

Moody_Mudskipper answered Nov 01 '22 05:11


library(purrr)
named_list <- list(one = c(1, 1),
                   two = c(2, 2),
                   three = c(3, 3))

pmap(named_list, function(one, two, three) one + two + three)

Or, as shown in the pmap documentation:

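# Matching arguments by name (x, y, z are assumed example inputs)
x <- list(1, 1, 1); y <- list(10, 20, 30); z <- list(100, 200, 300)
l <- list(a = x, b = y, c = z)
pmap(l, function(c, b, a) a / (b + c))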

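This works because pmap passes each list element under its name, so the function must provide a matching argument for every named element. For a data frame such as iris, that means every column, including Species: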

pmap_dbl(iris, function(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width, Species) Sepal.Length + Sepal.Width)
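
Because matching is by name, the order of the arguments in the function signature does not matter. The following is a small sketch of the difference between a named and an unnamed list. The helper digits is hypothetical and exists only to make the argument order visible in the result.

```r
library(purrr)

named_list <- list(one = c(1, 1),
                   two = c(2, 2),
                   three = c(3, 3))

# Hypothetical helper: the formals are deliberately listed in reverse order.
digits <- function(three, two, one) one * 100 + two * 10 + three

# Named list: each element is matched to the argument of the same name.
by_name <- pmap_dbl(named_list, digits)

# Unnamed list: elements are matched by position instead.
by_position <- pmap_dbl(unname(named_list), digits)
```

Here by_name is 123 for both rows, whereas by_position is 321, because the first element ends up in the argument three.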

You can also use ... to absorb the arguments you don't need:

pmap_dbl(iris, function(Sepal.Length, Sepal.Width, ...) Sepal.Length + Sepal.Width)

In practice, though, this example would ideally just use rowSums.
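
A minimal sketch of that comparison, assuming only the two sepal columns are being summed as in the call above:

```r
library(purrr)

# Row-wise sum via pmap_dbl, ignoring the unused columns through `...`.
by_pmap <- pmap_dbl(
  iris,
  function(Sepal.Length, Sepal.Width, ...) Sepal.Length + Sepal.Width
)

# Vectorised equivalent; rowSums returns a named vector, hence unname().
by_rowsums <- unname(rowSums(iris[, c("Sepal.Length", "Sepal.Width")]))

all.equal(by_pmap, by_rowsums)
```

rowSums works on the whole matrix at once rather than calling a function per row, so it is usually the simpler choice for plain sums.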

zacdav answered Nov 01 '22 04:11