Equivalent of `break` in purrr::map

Tags: r, dplyr, purrr

Say I want to run a loop until a condition is met, at which point the result is saved and the loop exits:

library(tidyverse)

for (i in 1:5) {
  df <- iris %>% select(i) %>% head(2)
  if (names(df) == "Petal.Width") {
    out <- df
    break
  }
}

out

How can I rewrite this using purrr::map without having to evaluate each i?

Doing the following gives the result I need, but it evaluates the function 5 times, whereas the for loop stops after 4 iterations (Petal.Width is the fourth column of iris):

fun <- function(x) {
  df <- iris %>% select(x) %>% head(2)
  if (names(df) == "Petal.Width") {
    return(df)
  }
}

map_df(1:5, fun)


asked Feb 12 '19 15:02

Shinobi_Atobe

2 Answers

There is no equivalent. In fact, one thing that makes map (and similar functions) so superior to general loops in terms of readability is that they have absolutely predictable behaviour: they will execute the function exactly once for each element, no exceptions (except, uh, if there’s an exception: you could raise a condition via stop to short-circuit execution, but this is very rarely advisable).

Instead, your case doesn’t call for map, it calls for something along the lines of purrr::keep or purrr::reduce.

Think of it this way: map, reduce, etc. are abstractions which correspond to specific special cases of the more general for loop. Their purpose is to make clear which special case is being handled. As a programmer, your task then becomes to find the right abstraction.
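As a rough illustration of that correspondence, the sketch below pairs three purrr functions with the loop pattern each one replaces. The vector x and the result names are made up for the example.

```r
library(purrr)

x <- 1:5  # illustrative input, not from the question

# map_*: transform every element; the function runs once per element
squares <- map_dbl(x, ~ .x^2)

# keep: filter, retaining the elements for which the predicate is TRUE
evens <- keep(x, ~ .x %% 2 == 0)

# reduce: fold all elements into a single value
total <- reduce(x, `+`)
```

Each call states up front which kind of loop it is, which a bare for loop cannot do.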

In your particular case I would probably rewrite the statement completely using dplyr, so giving a “best” purrr solution is hard: the best solution is not to use purrr. That said, you could use purrr::detect as follows:

names(iris) %>%
    detect(`==`, 'Sepal.Width') %>%
    `[`(iris, .) %>%
    head(2)

Or:

seq_along(iris) %>%
    detect(~ names(iris[.x]) == 'Sepal.Width') %>%
    `[`(iris, .) %>%
    head(2)
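To check that detect really stops at the first match, one can count how often the predicate runs. This is a minimal sketch using the question's Petal.Width column; the counter calls and the helper is_petal_width are hypothetical names introduced only for this demonstration.

```r
library(purrr)

calls <- 0L  # hypothetical counter, only here to observe the behaviour

# Hypothetical predicate: TRUE when column i of iris is Petal.Width
is_petal_width <- function(i) {
  calls <<- calls + 1L
  names(iris)[i] == "Petal.Width"
}

idx <- detect(seq_along(iris), is_petal_width)  # first matching position
out <- head(iris[idx], 2)

calls  # number of predicate evaluations before detect returned
```

Because Petal.Width is the fourth of five columns, the predicate should run four times rather than five, matching the behaviour of the for loop with break.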

… but really, here’s dplyr for comparison:

iris %>%
    select(Sepal.Width) %>%
    head(2)

answered Sep 27 '22 19:09

Konrad Rudolph

1) callCC can be used to get this effect:

callCC(function(k) {
  fun2 <- function(x) {
    print(x) # just to show that x = 5 is never run
    df <- iris %>% select(x) %>% head(2)
    if (names(df) == "Petal.Width") k(df)
  }
  map_df(1:5, fun2)
})

giving:

[1] 1
[1] 2
[1] 3
[1] 4
  Petal.Width
1         0.2
2         0.2
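For readers unfamiliar with callCC, here is a smaller base R sketch of the same idea, independent of purrr. The function passed to callCC receives an exit function k; calling k(value) abandons whatever is still running inside callCC and makes callCC return value. The names visited and first_big are illustrative.

```r
visited <- integer()  # records which elements were actually processed

first_big <- callCC(function(k) {
  lapply(1:10, function(i) {
    visited <<- c(visited, i)
    if (i^2 > 20) k(i)  # jump out of lapply and out of callCC
  })
  NULL  # reached only if k() is never called
})

first_big  # the first i whose square exceeds 20
visited    # elements seen before the exit
```

Here lapply is cut short at i = 5, so the elements 6 to 10 are never visited.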

1a) If it is important to use fun without change then try this instead:

callCC(function(k) map_df(1:5, ~ if (!is.null(df <- fun(.x))) k(df)))

2) purrr::reduce: an alternative is to use reduce from purrr (or Reduce from base R):

f <- function(x, y) if (is.null(x)) fun(y) else x
reduce(1:5, f, .init = NULL)

This is not as good as (1) and (1a) in that it still iterates over every element of 1:5, although it invokes fun only for 1:4. In contrast, (1) and (1a) return immediately after running fun or fun2 on 4.
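A possible variation, not part of the answer above: assuming purrr 0.3.0 or later, reduce stops early when the reducing function returns a value wrapped in done() (called here as rlang::done). The sketch below uses a hypothetical stand-in check_column in place of fun so that it is self-contained, plus a counter to show how many elements were processed.

```r
library(purrr)

calls <- 0L  # hypothetical counter, only for demonstration

# Hypothetical stand-in for `fun`: the first two rows of column i
# if it is Petal.Width, otherwise NULL
check_column <- function(i) {
  calls <<- calls + 1L
  df <- head(iris[i], 2)
  if (names(df) == "Petal.Width") df else NULL
}

# Reducing function: keep the accumulator until a match is found,
# then signal early termination by boxing the result with done()
step <- function(acc, i) {
  res <- check_column(i)
  if (is.null(res)) acc else rlang::done(res)
}

out <- reduce(1:5, step, .init = NULL)
```

With this variant the fifth element should not be visited at all, so the iteration count matches (1) and (1a).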

answered Sep 27 '22 20:09

G. Grothendieck
