
How to use case_when with entire dataframe?

Tags: r, dplyr

I'd like to apply case_when to all columns in the data frame.

library(dplyr)

set.seed(1)
data <- tibble(x = runif(10), y = x * 2)
data

In every column, I'd like to replace values above 0.5 with the string ">0.5" and values above 1 with the string ">1".

I've tried case_when, but it appears that I have to name each column explicitly (x, y). I'd like to apply case_when to the entire data frame without specifying columns.

writer_typer asked Oct 20 '25 04:10


2 Answers

A purrr solution:

library(dplyr)  # provides case_when()
library(purrr)  # provides map_df()

data %>%
map_df(~case_when(.x > 0.5 & .x < 1 ~ ">0.5",
                  .x >= 1 ~ ">1"))

Output:

   x     y    
   <chr> <chr>
 1 NA    >0.5 
 2 NA    >0.5 
 3 >0.5  >1   
 4 >0.5  >1   
 5 NA    NA   
 6 >0.5  >1   
 7 >0.5  >1   
 8 >0.5  >1   
 9 >0.5  >1   
10 NA    NA   
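
A note on the NA entries: case_when() returns NA for any value that matches none of the conditions, which is why everything at or below 0.5 comes out as NA here. If those values should be kept instead, one option is a catch-all branch. A minimal sketch on a made-up vector v (not from the question), assuming it is acceptable to convert the kept values to text, since every branch has to return the same type:

```r
library(dplyr)

# Made-up example values, not the data from the question
v <- c(0.2, 0.7, 1.5)

# Without a fallback, 0.2 matches no condition and becomes NA
without_default <- case_when(
  v > 0.5 & v < 1 ~ ">0.5",
  v >= 1          ~ ">1"
)

# case_when() uses the first condition that matches, so testing
# "> 1" before "> 0.5" removes the need for the "& v < 1" clause.
# The final TRUE branch catches everything else.
with_default <- case_when(
  v > 1   ~ ">1",
  v > 0.5 ~ ">0.5",
  TRUE    ~ as.character(v)
)

without_default
with_default
```

The same function body can be dropped into the map_df() call in place of the two-condition version.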
Samet Sökel answered Oct 22 '25 17:10


Here is a potential solution:

library(tidyverse)

set.seed(1)
data <- tibble(x = runif(10), y = x * 2) 
data
#> # A tibble: 10 × 2
#>         x     y
#>     <dbl> <dbl>
#>  1 0.266  0.531
#>  2 0.372  0.744
#>  3 0.573  1.15 
#>  4 0.908  1.82 
#>  5 0.202  0.403
#>  6 0.898  1.80 
#>  7 0.945  1.89 
#>  8 0.661  1.32 
#>  9 0.629  1.26 
#> 10 0.0618 0.124

data %>%
  mutate(across(everything(),
                ~case_when(.x > 0.5 & .x < 1.0 ~ ">0.5",
                           .x >= 1.0 ~ ">1")))
#> # A tibble: 10 × 2
#>    x     y    
#>    <chr> <chr>
#>  1 <NA>  >0.5 
#>  2 <NA>  >0.5 
#>  3 >0.5  >1   
#>  4 >0.5  >1   
#>  5 <NA>  <NA> 
#>  6 >0.5  >1   
#>  7 >0.5  >1   
#>  8 >0.5  >1   
#>  9 >0.5  >1   
#> 10 <NA>  <NA>

Created on 2021-10-24 by the reprex package (v2.0.1)
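
One thing to watch with everything(): if the data frame also has non-numeric columns, the comparison is applied to them too. A possible variation, assuming only numeric columns should be relabelled, is to select with where(is.numeric). The id column below is hypothetical and only there to show that it is left alone:

```r
library(dplyr)

# Hypothetical data: `id` is a made-up character column
df <- tibble(id = c("a", "b", "c"),
             x  = c(0.2, 0.7, 1.5),
             y  = x * 2)

labelled <- df %>%
  mutate(across(where(is.numeric),
                ~ case_when(.x > 1   ~ ">1",
                            .x > 0.5 ~ ">0.5")))

labelled
```

Values at or below 0.5 still become NA here, as in the output above.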

jared_mamrot answered Oct 22 '25 19:10