
Equivalent of `break` in purrr::map

Tags: r, dplyr, purrr

Say I want to run a loop until a condition is met, at which point the result is saved and the loop exits:

library(tidyverse)

for (i in 1:5) {
  df <- iris %>% select(i) %>% head(2)

  if (names(df) == "Petal.Width") {
    out <- df
    break
  }
}

out

How can I rewrite this using purrr::map without having to evaluate each i?

Doing the following gives the result I need, but it evaluates the function for all 5 values, whereas the for loop stops after 4 (Petal.Width is the fourth column of iris):

fun <- function(x) {
  df <- iris %>% select(x) %>% head(2)

  if (names(df) == "Petal.Width") {
    return(df)
  }
}

map_df(1:5, fun)
Shinobi_Atobe asked Feb 12 '19 15:02



2 Answers

There is no equivalent. In fact, one thing that makes map (and similar functions) so superior to general loops in terms of readability is that they have absolutely predictable behaviour: they execute the function exactly once for each element, no exceptions. (Except, uh, if there’s an exception: you could raise a condition via stop to short-circuit execution, but this is very rarely advisable.)
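
For illustration only, the condition-based escape mentioned above might look roughly like the sketch below. The helper found() and the counter visited are hypothetical names introduced here, not part of purrr, and the sketch assumes a custom condition class that does not inherit from "error", so that only the outer tryCatch handles it.

```r
library(purrr)

visited <- integer()  # records which elements were actually processed

# Hypothetical helper: a custom condition object that carries the result.
found <- function(value) {
  structure(
    class = c("found", "condition"),
    list(message = "match found", call = NULL, value = value)
  )
}

out <- tryCatch(
  {
    map(1:5, function(i) {
      visited <<- c(visited, i)
      df <- head(iris[i], 2)
      # Signal the custom condition to leave map() early.
      if (names(df) == "Petal.Width") stop(found(df))
      NULL
    })
    NULL  # reached only if no element matched
  },
  found = function(cnd) cnd$value  # unwrap the carried result
)
```

Inspecting visited afterwards shows which elements were processed before the exit. This uses conditions for control flow, which is exactly why it is rarely a good idea.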

Instead, your case doesn’t call for map, it calls for something along the lines of purrr::keep or purrr::reduce.

Think of it this way: map, reduce, etc. are abstractions which correspond to specific special cases of the more general for loop. Their purpose is to make clear which special case is being handled. As a programmer, your task then becomes to find the right abstraction.

In your particular case I would probably rewrite the statement completely using dplyr, so giving a “best” purrr solution is hard: the best solution is not to use purrr. That said, you could use purrr::detect as follows:

names(iris) %>%
    detect(`==`, 'Petal.Width') %>%
    `[`(iris, .) %>%
    head(2)

Or

seq_along(iris) %>%
    detect(~ names(iris[.x]) == 'Petal.Width') %>%
    `[`(iris, .) %>%
    head(2)

… but really, here’s dplyr for comparison:

iris %>%
    select(Petal.Width) %>%
    head(2)
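
To check that the detect family really does stop at the first match, here is a small sketch that counts predicate calls. The names is_target and calls are made up for this illustration; detect_index is the purrr function that returns the position of the first match.

```r
library(purrr)

calls <- 0L  # how many times the predicate has been evaluated

# Hypothetical predicate: is column i of iris the one we want?
is_target <- function(i) {
  calls <<- calls + 1L
  names(iris)[i] == "Petal.Width"
}

idx <- detect_index(1:5, is_target)  # position of the first match
out <- head(iris[idx], 2)

calls
```

The predicate is evaluated for positions 1 to 4 only, matching the for loop in the question.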
Konrad Rudolph answered Sep 27 '22 19:09


1) callCC can be used to get this effect:

callCC(function(k) {
  fun2 <- function(x) {
    print(x) # just to show that x = 5 is never run
    df <- iris %>% select(x) %>% head(2)
    if (names(df) == "Petal.Width") k(df)
  }
  map_df(1:5, fun2)
})

giving:

[1] 1
[1] 2
[1] 3
[1] 4
  Petal.Width
1         0.2
2         0.2

1a) If it is important to use fun without change then try this instead:

callCC(function(k) map_df(1:5, ~ if (!is.null(df <- fun(.x))) k(df)))

2) purrr::reduce. An alternative is to use reduce from purrr (or Reduce from base R):

f <- function(x, y) if (is.null(x)) fun(y) else x
reduce(1:5, f, .init = NULL)

This is not as good as (1) and (1a), because reduce still iterates over every element of 1:5, even though fun is only invoked for 1:4. In contrast, (1) and (1a) return immediately after running fun or fun2 on 4.
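
As a possible variation on (2), reduce can also stop early if the reducing function returns its value wrapped in done(). This is a sketch that assumes a purrr version with done() support. The names step and visited are hypothetical and only there to show which elements get visited.

```r
library(purrr)

visited <- integer()  # records which elements reduce() actually reached

# Hypothetical reducer: wrap the result in done() to end the reduction early.
step <- function(acc, i) {
  visited <<- c(visited, i)
  df <- head(iris[i], 2)
  if (names(df) == "Petal.Width") done(df) else acc
}

out <- reduce(1:5, step, .init = NULL)
```

Here the reduction ends as soon as the match is found, so the fifth element is never visited.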

G. Grothendieck answered Sep 27 '22 20:09