
group_by and pmap a piecewise operation on each row per group (ifelse vs case_when)

Tags: r, dplyr

I am trying to group_by a variable and then perform an operation on each row within each group. I got lost choosing between ifelse and case_when. I assumed both would give the same output, but that is not the case here: case_when gives the expected output and ifelse does not. I am trying to understand why ifelse behaves differently.

Here is the example data frame:

structure(list(Pos = c(73L, 146L, 146L, 150L, 150L, 151L, 151L, 
152L, 182L, 182L), Percentage = c(81.2, 13.5, 86.4, 66.1, 33.9, 
48.1, 51.9, 86.1, 48, 52)), row.names = c(NA, -10L), class = c("tbl_df", 
"tbl", "data.frame")) -> foo

I am grouping by Pos, and I want to round the Percentage values within a group if they sum to 100. The following uses ifelse:

library(tidyverse)
foo %>% 
     group_by(Pos) %>%
     mutate(sumn = n()) %>% 
     mutate(Val = ifelse(sumn == 1,100,
                         ifelse(sum(Percentage) == 100, unlist(map(Percentage,round)), 0)
                         # case_when(sum(Percentage) == 100 ~ unlist(map(Percentage,round)),
                         #           TRUE ~ 0
                         # )
                         ))

The output is:

# A tibble: 10 x 4
# Groups:   Pos [6]
     Pos Percentage  sumn   Val
   <int>      <dbl> <int> <dbl>
 1    73       81.2     1   100
 2   146       13.5     2     0
 3   146       86.4     2     0
 4   150       66.1     2    66
 5   150       33.9     2    66
 6   151       48.1     2    48
 7   151       51.9     2    48
 8   152       86.1     1   100
 9   182       48       2    48
10   182       52       2    48

I don't want this. Rather, I want the following, which I get using case_when:

foo %>% 
     group_by(Pos) %>%
     mutate(sumn = n()) %>% 
     mutate(Val = ifelse(sumn == 1,100,
                         #ifelse(sum(Percentage) == 100, unlist(map(Percentage,round)), 0)
                         case_when(sum(Percentage) == 100 ~ unlist(map(Percentage,round)),
                                   TRUE ~ 0
                         )
                         ))

# A tibble: 10 x 4
# Groups:   Pos [6]
     Pos Percentage  sumn   Val
   <int>      <dbl> <int> <dbl>
 1    73       81.2     1   100
 2   146       13.5     2     0
 3   146       86.4     2     0
 4   150       66.1     2    66
 5   150       33.9     2    34
 6   151       48.1     2    48
 7   151       51.9     2    52
 8   152       86.1     1   100
 9   182       48       2    48
10   182       52       2    52

What is ifelse doing differently?

smandape asked Dec 10 '25 00:12


1 Answer

According to ?ifelse, the return value is:

A vector of the same length and attributes (including dimensions and "class") as test and data values from the values of yes or no.
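
The quoted rule can be checked outside of dplyr. The following is a minimal base R sketch, assuming a two-element vector `pct` that mirrors the Pos 150 group (the name `pct` is only illustrative). It compares a length-1 test with a test replicated to the length of `yes`:

```r
# Illustrative values, taken from the Pos == 150 group in the question
pct <- c(66.1, 33.9)

# Length-1 test: the result has length 1, so only the first element of `yes` is used
short <- ifelse(TRUE, round(pct), 0)
print(short)  # 66

# Test replicated to the length of `yes`: one result per element
full <- ifelse(rep(TRUE, length(pct)), round(pct), 0)
print(full)   # 66 34
```

Inside a grouped mutate, the single value from the first call is what gets recycled over the whole group, which is why both rows of a group showed the same rounded value.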

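Within each group, the test sum(Percentage) == 100 has length 1, so the inner ifelse returns only one element (the first rounded Percentage), and that single value is then recycled over the whole group. If we replicate the test with rep() so that its length matches the group size, it works: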

foo %>% 
  group_by(Pos) %>%
  mutate(sumn = n()) %>% 
  mutate(Val = ifelse(sumn == 1, 100,
                      ifelse(rep(sum(Percentage) == 100, n()),
                             unlist(map(Percentage, round)), 0)))
# A tibble: 10 x 4
# Groups:   Pos [6]
     Pos Percentage  sumn   Val
   <int>      <dbl> <int> <dbl>
 1    73       81.2     1   100
 2   146       13.5     2     0
 3   146       86.4     2     0
 4   150       66.1     2    66
 5   150       33.9     2    34
 6   151       48.1     2    48
 7   151       51.9     2    52
 8   152       86.1     1   100
 9   182       48       2    48
10   182       52       2    52
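
Because both conditions are single values per group, a plain if/else inside the grouped mutate may be a simpler alternative to the nested ifelse. This is only a sketch, not part of the original answer: it assumes dplyr recycles a length-1 result to the group size, rebuilds the question's data with tibble(), and replaces unlist(map(Percentage, round)) with the vectorised round(Percentage). The name `res` is illustrative.

```r
library(dplyr)

# Same data as in the question, rebuilt with tibble() for readability
foo <- tibble(
  Pos = c(73L, 146L, 146L, 150L, 150L, 151L, 151L, 152L, 182L, 182L),
  Percentage = c(81.2, 13.5, 86.4, 66.1, 33.9, 48.1, 51.9, 86.1, 48, 52)
)

res <- foo %>%
  group_by(Pos) %>%
  mutate(
    # n() and sum(Percentage) are scalars per group, so if/else applies
    Val = if (n() == 1) {
      100                 # single-row group
    } else if (sum(Percentage) == 100) {
      round(Percentage)   # one rounded value per row
    } else {
      0                   # recycled to the group size by mutate()
    }
  ) %>%
  ungroup()

print(res)
```

Under these assumptions, Val should match the case_when result shown in the question.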

akrun answered Dec 11 '25 13:12