Say I want to run a loop until a condition is met, at which point the result is saved and the loop exits:
library(tidyverse)

for (i in 1:5) {
  # all_of() makes it explicit that i is an external variable, not a column name
  df <- iris %>% select(all_of(i)) %>% head(2)
  if (names(df) == "Petal.Width") {
    out <- df
    break
  }
}
out
How can I rewrite this using purrr::map without having to evaluate each i?
Doing the following gives the result I need, but it has to evaluate the function 5 times, whereas the for loop stops after 4 iterations (Petal.Width is the fourth column of iris):
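fun <- function(x) {
  # all_of() makes it explicit that x is an external variable, not a column name
  df <- iris %>% select(all_of(x)) %>% head(2)
  if (names(df) == "Petal.Width") {
    return(df)
  }
  # falls through and returns NULL when the column does not match
}
map_df(1:5, fun)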
To map to a character vector, you can use the map_chr() (“map to a character”) function. If you want to return a data frame, then you would use the map_df() function. However, you need to make sure that in each iteration you're returning a data frame which has consistent column names.
The map functions transform their input by applying a function to each element of a list or atomic vector and returning an object of the same length as the input. map() always returns a list. See the modify() family for versions that return an object of the same type as the input.
There is no equivalent. In fact, one thing that makes map (and similar functions) so superior to general loops in terms of readability is that they have absolutely predictable behaviour: they will execute the function exactly once for each element, no exceptions (except, uh, if there’s an exception: you could raise a condition via stop to short-circuit execution, but this is very rarely advisable).
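To make the "exactly once for each element" behaviour concrete, here is a minimal sketch that counts calls. The names calls and is_petal_width are hypothetical helpers introduced only for this illustration.

```r
library(purrr)

# Call counter kept in an environment so the function can update it in place
calls <- new.env()
calls$n <- 0

# Predicate: is column i of iris the one we are looking for?
is_petal_width <- function(i) {
  calls$n <- calls$n + 1
  names(iris)[i] == "Petal.Width"
}

hits <- map_lgl(1:5, is_petal_width)

calls$n      # 5: every element was visited, even after the match at 4
which(hits)  # 4
```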
Instead, your case doesn’t call for map, it calls for something along the lines of purrr::keep or purrr::reduce.
Think of it this way: map, reduce, etc. are abstractions which correspond to specific special cases of the more general for loop. Their purpose is to make clear which special case is being handled. As a programmer, your task then becomes to find the right abstraction.
In your particular case I would probably rewrite the statement completely using dplyr, so giving a “best” purrr solution is hard: the best solution is not to use purrr. That said, you could use purrr::detect as follows:
names(iris) %>%
  detect(`==`, 'Sepal.Width') %>%
  `[`(iris, .) %>%
  head(2)
Or
seq_along(iris) %>%
  detect(~ names(iris[.x]) == 'Sepal.Width') %>%
  `[`(iris, .) %>%
  head(2)
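Unlike map, detect stops at the first element for which the predicate is TRUE, which is what the original for loop with break does. The following sketch checks this with a call counter, using the Petal.Width column from the question. The names seen and matches_target are hypothetical and exist only for this illustration.

```r
library(purrr)

# Call counter kept in an environment so the predicate can update it in place
seen <- new.env()
seen$n <- 0

# Predicate: does column i of iris have the name we want?
matches_target <- function(i) {
  seen$n <- seen$n + 1
  names(iris)[i] == "Petal.Width"
}

idx <- detect(seq_along(iris), matches_target)
first_two <- head(iris[idx], 2)

idx        # 4
seen$n     # 4: column 5 was never tested
first_two  # first two rows of Petal.Width
```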
… but really, here’s dplyr for comparison:
iris %>%
  select(Sepal.Width) %>%
  head(2)
1) callCC can be used to get this effect:
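callCC(function(k) {
  fun2 <- function(x) {
    print(x) # just to show that x = 5 is never run
    # all_of() makes it explicit that x is an external variable, not a column name
    df <- iris %>% select(all_of(x)) %>% head(2)
    if (names(df) == "Petal.Width") k(df) # k() exits callCC immediately with df
  }
  map_df(1:5, fun2)
})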
giving:
[1] 1
[1] 2
[1] 3
[1] 4
  Petal.Width
1         0.2
2         0.2
1a) If it is important to use fun without change then try this instead:
callCC(function(k) map_df(1:5, ~ if (!is.null(df <- fun(.x))) k(df)))
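For readers who have not met callCC before, here is a stripped-down sketch in base R only. callCC passes an exit function (called k here) to its argument, and calling k(value) abandons whatever is still running inside and makes callCC return value. The environment visited is a hypothetical helper used only to record which elements were processed.

```r
# Record of which elements were actually processed
visited <- new.env()
visited$ids <- integer(0)

res <- callCC(function(k) {
  lapply(1:5, function(i) {
    visited$ids <- c(visited$ids, i)
    if (i == 3) k(i * 10)  # jump straight out of callCC with this value
    i
  })
})

res          # 30
visited$ids  # 1 2 3: elements 4 and 5 were never reached
```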
2) purrr::reduce. An alternative is to use reduce from purrr (or Reduce from base R):
f <- function(x, y) if (is.null(x)) fun(y) else x
reduce(1:5, f, .init = NULL)
This is not as good as (1) and (1a) in that it still iterates over every element of 1:5, although it only invokes fun for 1:4. In contrast, (1) and (1a) actually return after running fun or fun2 on 4.
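A possible refinement of (2), offered as a sketch rather than as part of the original answer: purrr's reduce supports early termination when the reducing function wraps its result in done(), which to my knowledge is available from purrr 0.3.0 onwards. The helpers fun_counted and stop_at_first are hypothetical. fun_counted mirrors fun from the question but uses base subsetting and adds a call counter.

```r
library(purrr)

# Call counter kept in an environment so the function can update it in place
calls <- new.env()
calls$n <- 0

# Same logic as fun in the question: a 2-row data frame on a match, NULL otherwise
fun_counted <- function(i) {
  calls$n <- calls$n + 1
  df <- head(iris[i], 2)
  if (names(df) == "Petal.Width") df
}

# Keep the accumulator until there is a result, then signal reduce() to stop
stop_at_first <- function(acc, i) {
  res <- fun_counted(i)
  if (is.null(res)) acc else done(res)
}

out <- reduce(1:5, stop_at_first, .init = NULL)

out      # first two rows of Petal.Width
calls$n  # 4: the reduction stopped before reaching 5
```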