Losing names when using lists as results of dplyr::case_when

Tags:

r

dplyr

Please consider the following code:

library(dplyr)
x <- case_when(
   FALSE ~ list('a' = 'b'),
   TRUE  ~ list('c' = 'd')
)

x is

List of 1
 $ NA: chr "d"

I expect the element d in x to have the name 'c' and not NA. Am I missing something? Is this a bug? And how can I achieve my expected behavior?

To be precise, I expect the above statement to have the same result as

x <- list('c' = 'd')
ACNB asked Nov 07 '22 14:11



1 Answer

The following is irrelevant; please skip to the update.

It's not really clear what your expected behavior is from the short code snippet and without sample data.

However it seems to me you have the wrong syntax on case_when.

The function works like this:

case_when(
   Condition 1 ~ Value to be assigned if true,
   Condition 2 ~ Value to be assigned if true
)

The conditions you use are FALSE and TRUE. This does not really make sense, because the following will happen:

x <- case_when(
   FALSE ~ list('a' = 'b'), # FALSE is logically never True, so the value is never put in
   TRUE  ~ list('c' = 'd') # TRUE is always true, x will always be assigned the list
)

So first you have to rewrite your conditions to make sense. Secondly you are assigning lists as the return value and I do not think this is correct.

I assume you want to do this:

x <- case_when(
   VAR == 'a' ~ 'b', # If the variable to be evaluated has the value 'a' x will be 'b'
   VAR == 'c' ~ 'd' # If the variable to be evaluated has the value 'c' x will be 'd'
)

So this now evaluates an existing variable "VAR" and returns x as determined by your code. Notice that this statement is incomplete, because it will return NA for every case where neither of the two conditions is met (so VAR is neither 'a' nor 'c').

So normally we complete it like this:

x <- case_when(
   VAR == 'a' ~ 'b', # If the variable to be evaluated has the value 'a' x will be 'b'
   VAR == 'c' ~ 'd', # If the variable to be evaluated has the value 'c' x will be 'd'
   TRUE ~ 'Rest Value' # Assigns the rest value to x for all cases that do not meet case 1 or 2
)
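
The fallback pattern above can be tried on a small vector. This is only a sketch: the contents of VAR are made up for illustration, and the two result names are hypothetical.

```r
library(dplyr)

# VAR is a hypothetical input vector, made up for this illustration.
VAR <- c("a", "c", "z")

# Without a fallback, the unmatched element "z" becomes NA.
without_fallback <- case_when(
  VAR == "a" ~ "b",
  VAR == "c" ~ "d"
)

# With a final TRUE condition, unmatched elements get the rest value.
with_fallback <- case_when(
  VAR == "a" ~ "b",
  VAR == "c" ~ "d",
  TRUE ~ "Rest Value"
)

print(without_fallback)
print(with_fallback)
```

Each element of VAR is checked against the conditions in order, and the first matching condition decides the value for that element.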

Update

It seems this is a known issue; see here:

https://github.com/tidyverse/dplyr/issues/4194

Hadley gives the following solution as an alternative:

broken_function <- function(value) {
  if (value) {
    list(a = 1, b = 2)
  } else {
    lst(a = 2, b = 3)
  }
}
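
Applied to the lists from the question, the same if/else approach might look like the sketch below. It assumes the condition is a single TRUE or FALSE rather than a vector, and `cond` is a hypothetical name introduced only for this example.

```r
# cond is a hypothetical scalar condition, made up for this illustration.
cond <- FALSE

# Plain if/else returns the chosen list unchanged, so its names survive.
x <- if (cond) {
  list(a = "b")
} else {
  list(c = "d")
}

str(x)
```

Because if/else evaluates a single condition and returns one of the two values as-is, nothing is combined or recycled, which is what the question expected from case_when.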
Fnguyen answered Nov 15 '22 10:11
