How to filter out NULL elements of tibble's list column

Question

I've got a tibble like below:

structure(list(id = 1:11, var1 = c("A", "C", "B", "B", "B", "A", 
"B", "C", "C", "C", "B"), var2 = list(NULL, NULL, NULL, structure(list(
    x = c(0, 1, 23, 3), y = c(0.75149005651474, 0.149892757181078, 
    0.695984086720273, 0.0247649133671075)), row.names = c(NA, 
-4L), class = c("tbl_df", "tbl", "data.frame")), NULL, NULL, 
    NULL, NULL, NULL, NULL, NULL)), row.names = c(NA, -11L), class = c("tbl_df", 
"tbl", "data.frame"))

I'd like to keep only the rows where var2 is not NULL, but the simple !is.null() doesn't work: df %>% filter(!is.null(var2)) returns the whole df. Why is that, and how can I filter out all the rows with NULL in the var2 column?

tmfmnk · Accepted Answer

One possibility also involving purrr could be:

library(dplyr)
library(purrr)
df %>%
  filter(!map_lgl(var2, is.null))

     id var1  var2            
  <int> <chr> <list>          
1     4 B     <tibble [4 × 2]>
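
For readers who would rather not load purrr, the same per-element test can be sketched in base R with vapply(). The small df below is an illustrative stand-in for the question's data, and keep and result are names made up for this example.

```r
library(tibble)

# Illustrative stand-in for the question's data (values are made up)
df <- tibble(
  id   = 1:4,
  var1 = c("A", "C", "B", "B"),
  var2 = list(NULL, NULL, NULL, tibble(x = c(0, 1), y = c(0.75, 0.15)))
)

# Apply is.null() to each element of the list column;
# logical(1) makes vapply() fail loudly if a result is not a single logical
keep <- !vapply(df$var2, is.null, logical(1))

# Subset the rows whose var2 element is not NULL
result <- df[keep, ]
result$id
```

vapply() is used here rather than sapply() because it always returns a logical vector, even when the list column is empty.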

Because is.null() is not vectorized (it tests the object as a whole and returns a single TRUE or FALSE), you can also evaluate it row by row:

df %>%
 rowwise() %>%
 filter(!is.null(var2))
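
A minimal sketch of why the plain filter() keeps everything while the rowwise() version does not. The three-row df and the variable names are illustrative assumptions, not the question's data.

```r
library(dplyr)
library(tibble)

# Illustrative three-row example with one non-NULL element
df <- tibble(id = 1:3, var2 = list(NULL, tibble(x = 1), NULL))

# is.null() looks at the object as a whole: the column is a list, not NULL
whole_column <- is.null(df$var2)          # one value for the entire column
per_element  <- sapply(df$var2, is.null)  # one value per element

# Without rowwise(), the single TRUE from !is.null(var2) is recycled to all rows
n_plain <- df %>% filter(!is.null(var2)) %>% nrow()

# With rowwise(), var2 refers to the current row's element
n_rowwise <- df %>% rowwise() %>% filter(!is.null(var2)) %>% nrow()

c(plain = n_plain, rowwise = n_rowwise)
```

Note that the rowwise() result is still a rowwise data frame; adding ungroup() afterwards returns it to an ordinary tibble.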

Adam · Answer

The function drop_na() from tidyr will also work for NULL. Just be careful about the edge case where the column contains both NULL and NA values and you only want to drop the NULL ones.

Documentation: "Drop rows containing missing values" (tidyr drop_na() reference)

library(tidyr)

df %>% 
  drop_na(var2)

#        id var1  var2                
#     <int> <chr> <list>              
#   1     4 B     <tibble[,2] [4 x 2]>
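
For the edge case mentioned above, where NA entries should stay and only NULL entries should go, one option that does not depend on how drop_na() treats missing values is to test for NULL explicitly. This is a sketch on made-up data; mixed and kept are hypothetical names.

```r
library(dplyr)
library(purrr)
library(tibble)

# Illustrative list column holding a NULL, an NA and a tibble
mixed <- tibble(
  id   = 1:3,
  var2 = list(NULL, NA, tibble(x = 1))
)

# is.null(NA) is FALSE, so the NA row survives; only the NULL row is dropped
kept <- mixed %>% filter(!map_lgl(var2, is.null))
kept$id
```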

Grada Gukovic · Answer

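!is.null(var2) doesn't work because is.null() checks only the top level: it tests whether var2 as a whole is NULL, and var2 is a list, so the result is a single FALSE for the entire column. Your var2 is in fact a nested list (a list of lists). It contains a tibble as its fourth element, and a tibble is a list because it is a data.frame.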

#show that the tibble is a list:
> is.list(df$var2[[4]])
[1] TRUE

Instead, you can filter on lengths(df$var2) > 0, since NULL has length 0:

> lengths(df$var2)
 [1] 0 0 0 2 0 0 0 0 0 0 0  
# each of the columns of the tibble in var2[[4]] is one element 
# of the list contained in var2[[4]]. Thus var2[[4]] is a list of length two

> df %>% filter(lengths(var2) > 0)
# A tibble: 1 x 3
     id var1  var2            
  <int> <chr> <list>          
1     4 B     <tibble [4 x 2]>
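
One caveat with the lengths() approach, sketched in base R on an illustrative list: a length of 0 is not the same thing as NULL. Empty vectors also have length 0, while a data frame with zero rows has a length equal to its number of columns.

```r
# Illustrative list: NULL, an empty vector, a zero-row data frame, a vector
x <- list(NULL, character(0), data.frame(a = numeric(0)), 1:3)

# lengths() counts elements (columns, for a data frame)
len <- lengths(x)

# is.null() applied per element flags only the true NULL
nul <- vapply(x, is.null, logical(1))

data.frame(len = len, is_null = nul)
```

So lengths(var2) > 0 would also drop rows holding an empty vector, and it would keep a zero-row tibble. When every non-NULL element is a non-empty tibble, as in the question, both tests select the same rows.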

Tags: r, dplyr

jakes
