
`purrr::map` to any type

Tags: r, purrr, tidyverse

Is there a way to map to any type with purrr::map?

library(tidyverse)
library(lubridate)

df <- data_frame(id = c(1, 1, 1, 2, 2, 2), 
                 val = c(1, 2, 3, 1, 2, 3), 
                 date = ymd("2017-01-01") + days(1:6))

df1 <- df %>% nest(-id) %>% 
  mutate(first_val = map_dbl(data, ~ .$val[1]), 
         first_day = map(data, ~ .$date[1]))

I would like first_day to be a column of type <date>, as in df. I have tried flatten(), but this does not work because it coerces the column to numeric.

johannes asked Apr 08 '17 09:04


People also ask

What does the map function from purrr do?

The map functions transform their input by applying a function to each element of a list or atomic vector and returning an object of the same length as the input. map() always returns a list. See the modify() family for versions that return an object of the same type as the input.

Is lapply faster than map?

The output should be the same, and the benchmarks I ran seem to show that lapply is slightly faster (which is expected, as map needs to evaluate all of its non-standard-evaluation input).

Is purrr part of the Tidyverse?

The tidyverse is a coherent collection of R packages for data science (tidyverse is itself a package that loads its constituent packages). Packages include: data wrangling: dplyr, tidyr, readr; iteration: purrr.


2 Answers

An alternative to map_dbl() %>% as_date() is to call unnest() on the output column of interest:

library(tidyverse)
library(lubridate)
#> 
#> Attaching package: 'lubridate'
#> The following object is masked from 'package:base':
#> 
#>     date

df <- data_frame(id = c(1, 1, 1, 2, 2, 2), 
                 val = c(1, 2, 3, 1, 2, 3), 
                 date = ymd("2017-01-01") + days(1:6))

df %>% nest(-id) %>% 
  mutate(first_val = map_dbl(data, ~ .$val[1]), 
         first_day = map(data, ~ .$date[1])) %>% 
  unnest(first_day)
#> # A tibble: 2 x 4
#>      id data             first_val first_day 
#>   <dbl> <list>               <dbl> <date>    
#> 1     1 <tibble [3 × 2]>         1 2017-01-02
#> 2     2 <tibble [3 × 2]>         1 2017-01-05

Created on 2018-11-17 by the reprex package (v0.2.1)
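
For readers on current package versions, here is a minimal sketch of the same idea. It assumes tidyr >= 1.0.0, where nest(-id) is spelled nest(data = -id), and it uses tibble() in place of the deprecated data_frame(). The name res is only for illustration.

```r
library(tidyverse)
library(lubridate)

# Same data as in the question; tibble() replaces the deprecated data_frame().
df <- tibble(id   = c(1, 1, 1, 2, 2, 2),
             val  = c(1, 2, 3, 1, 2, 3),
             date = ymd("2017-01-01") + days(1:6))

res <- df %>%
  nest(data = -id) %>%                              # one row per id, rest in `data`
  mutate(first_day = map(data, ~ .x$date[1])) %>%   # list column of length-1 Dates
  unnest(first_day)                                 # combine the list into a <date> column

class(res$first_day)
#> [1] "Date"
```

unnest() keeps the class because it combines the list elements as vectors instead of stripping their attributes.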

Matifou answered Sep 29 '22 13:09


purrr is type-stable and this takes some getting used to.

In this case, map() returns a list where you expected a <date> column.
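
A small base R sketch (no purrr needed) of what probably goes wrong when such a list is flattened: stripping attributes loses the Date class, while combining with c() keeps it. The names dates, flat and kept are only for illustration.

```r
# A list of one-element Date vectors, like the one map() returns here.
dates <- list(as.Date("2017-01-02"), as.Date("2017-01-05"))

# unlist() drops attributes, so only the underlying day counts remain.
flat <- unlist(dates)
class(flat)
#> [1] "numeric"

# c() has a Date method, so combining the elements with c() keeps the type.
kept <- do.call(c, dates)
class(kept)
#> [1] "Date"
```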

A simple and "stable" solution to your case would be to replace the second map with a map_dbl, which returns the dates as plain numbers, and turn the output back into a <date> object using lubridate's as_date, like this:

df3 <- df %>% nest(-id) %>% 
   mutate(first_val = map_dbl(data, ~ .$val[1]), 
          first_day = as_date(map_dbl(data, ~ .$date[1])))

You get:

# A tibble: 2 × 4
     id data             first_val first_day 
  <dbl> <list>               <dbl> <date>    
1     1 <tibble [3 × 2]>         1 2017-01-02
2     2 <tibble [3 × 2]>         1 2017-01-05

Which is what you wanted (for this example).
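
Why the round trip through a number is lossless, sketched in base R: a Date is stored as a count of days since 1970-01-01. The sketch uses as.Date() with an explicit origin; lubridate's as_date() in the code above plays the same role. The names d, n and back are only for illustration.

```r
# A Date is a double counting days since 1970-01-01, plus a class attribute.
d <- as.Date("2017-01-02")
n <- as.numeric(d)   # what is left once the class is dropped
n
#> [1] 17168

# Supplying the origin turns the day count back into a Date.
back <- as.Date(n, origin = "1970-01-01")
back
#> [1] "2017-01-02"
```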

EDIT: for types other than <date> you would have to find a different solution. The standard types, however, are covered by the dedicated map_lgl, map_dbl, map_chr, etc.
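
On newer versions of purrr there may be a more direct route. The sketch below assumes purrr >= 1.0.0, which added map_vec() for results that are length-1 vectors of any type (dates, factors, date-times); it was not available when this answer was written. It also assumes tidyr >= 1.0.0 for nest(data = -id), and df4 is just an illustrative name.

```r
library(tidyverse)   # assumes purrr >= 1.0.0 and tidyr >= 1.0.0
library(lubridate)

df <- tibble(id   = c(1, 1, 1, 2, 2, 2),
             val  = c(1, 2, 3, 1, 2, 3),
             date = ymd("2017-01-01") + days(1:6))

df4 <- df %>%
  nest(data = -id) %>%
  mutate(first_val = map_dbl(data, ~ .x$val[1]),
         # map_vec() combines the length-1 results and keeps their class
         first_day = map_vec(data, ~ .x$date[1]))

class(df4$first_day)
#> [1] "Date"
```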

Adi Sarid answered Sep 29 '22 13:09