purrr::map_df() drops NULL rows

Tags: r, purrr, tidyverse

When using purrr::map_df(), I occasionally pass in a list of data frames where some items are NULL. When I do, map_df() returns a data frame with fewer rows than the original list has elements.

I assume what's going on is that map_df() calls dplyr::bind_rows(), which ignores NULL values. However, I'm not sure how to identify my problematic rows.

Here's an example:

library(purrr)

problemlist  <- list(NULL, NULL, structure(list(bounds = structure(list(northeast = structure(list(
    lat = 41.49, lng = -71.46), .Names = c("lat", "lng"
), class = "data.frame", row.names = 1L), southwest = structure(list(
    lat = 41.49, lng = -71.46), .Names = c("lat", "lng"
), class = "data.frame", row.names = 1L)), .Names = c("northeast", 
"southwest"), class = "data.frame", row.names = 1L), location = structure(list(
    lat = 41.49, lng = -71.46), .Names = c("lat", "lng"
), class = "data.frame", row.names = 1L), location_type = "ROOFTOP", 
    viewport = structure(list(northeast = structure(list(lat = 41.49, 
        lng = -71.46), .Names = c("lat", "lng"), class = "data.frame", row.names = 1L), 
        southwest = structure(list(lat = 41.49, lng = -71.46), .Names = c("lat", 
        "lng"), class = "data.frame", row.names = 1L)), .Names = c("northeast", 
    "southwest"), class = "data.frame", row.names = 1L)), .Names = c("bounds", 
"location", "location_type", "viewport"), class = "data.frame", row.names = 1L))

# what actually happens
map_df(problemlist, 'location')

#     lat    lng
# 1 41.49 -71.46


# desired result
map_df_with_Null_handling(problemlist, 'location') 

#     lat    lng
# 1    NA     NA
# 2    NA     NA
# 3 41.49 -71.46

I considered wrapping my location accessor in one of purrr's error-handling functions (e.g., safely() or possibly()), but I'm not running into errors; I'm just not getting the desired results.

What's the best way to handle NULL values with map_df()?

asked Jan 24 '18 17:01 by crazybilly


1 Answer

You can use the (currently undocumented) .null argument of the map*() functions to supply the value returned when an element is NULL. Newer purrr releases name this argument .default, so use that name if .null is not recognised:

map_df(problemlist, 'location', .null = data_frame(lat = NA, lng = NA) )

#     lat    lng
# 1    NA     NA
# 2    NA     NA
# 3 41.49 -71.46
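
For reference, here is a minimal self-contained sketch of the same idea using the newer .default spelling, plus a way to see which positions were NULL in the first place. It assumes a recent purrr with tibble and dplyr installed; toy_list, null_idx, na_row and locations are illustrative names, and toy_list is only a simplified stand-in for problemlist.

```r
library(purrr)
library(tibble)

# Simplified stand-in for problemlist: two NULL entries and one real result.
toy_list <- list(
  NULL,
  NULL,
  list(location = tibble(lat = 41.49, lng = -71.46))
)

# Positions of the NULL entries, i.e. the rows map_df() would otherwise drop.
null_idx <- which(map_lgl(toy_list, is.null))

# Placeholder row returned wherever "location" cannot be extracted.
# Typed NAs keep the column types consistent with the real rows.
na_row <- tibble(lat = NA_real_, lng = NA_real_)

# .default is passed through map_df()'s ... to the extractor built from "location".
locations <- map_df(toy_list, "location", .default = na_row)

print(null_idx)
print(locations)
```

Keeping null_idx around makes it possible to join the NA rows back to whatever inputs produced them, which addresses the "identify my problematic rows" part of the question.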
answered Oct 18 '22 05:10 by crazybilly