Generating dynamic variables referencing existing variables

I'm trying to generate a correlation matrix with significance stars. Take the following dataframe:

df <- tibble(stub = c(1,2,3,4),
             stub_pvalue = c(.00, .04, .07,.2))

I'd like to write a function that pastes any column (e.g. stub in this example) concatenated with "***" if stub_pvalue is less than .01, and otherwise simply pastes stub. Something like:

assign_stars <- function(var) {

    if (paste0(var,"_pvalue") < .01) {
      paste0(var, "***")
    } else {
      paste0(var)
    }

}

df %>% 
  mutate(col_with_stars = map_chr(col, assign_stars))

However, I can't figure out how to have the if's first logical condition evaluate on the var + "_pvalue". Can anyone help?

Louis Maiden asked Aug 22 '19 20:08

1 Answer

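library(dplyr)
library(rlang)

# Append `marker` to the values of column `var` wherever the matching
# `<var>_pvalue` column is below `threshold`; store the result in `<var>_marker`.
assign_stars <- function(df, var, threshold, marker) {

  # Build the column names as strings first, then convert them to symbols
  val <- sym(paste(var, "pvalue", sep = "_"))
  out <- sym(paste(var, "marker", sep = "_"))
  var <- sym(var)

  mutate(df, !!out := if_else(!!val < threshold,
                              paste0(!!var, marker),
                              as.character(!!var)
                              )
         )
}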

For a single column, the following works:

df %>% 
  assign_stars(., "stub", 0.01, "***")

# # A tibble: 4 x 3
#    stub stub_pvalue stub_marker
#   <dbl>       <dbl> <chr>
# 1     1        0    1***
# 2     2        0.04 2
# 3     3        0.07 3
# 4     4        0.2  4
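
As a side note, the same lookup can be sketched without converting strings to symbols, by indexing dplyr's `.data` pronoun with the pasted column name. This is only an illustrative variant, not the function above: the name `add_marker` and its default arguments are hypothetical.

```r
library(dplyr)

# Hypothetical helper: same idea as assign_stars(), but the columns are
# looked up by name through the .data pronoun instead of via symbols.
add_marker <- function(df, var, threshold = 0.01, marker = "***") {
  p_col   <- paste0(var, "_pvalue")   # e.g. "stub_pvalue"
  out_col <- paste0(var, "_marker")   # e.g. "stub_marker"

  mutate(df, !!out_col := if_else(.data[[p_col]] < threshold,
                                  paste0(.data[[var]], marker),
                                  as.character(.data[[var]])))
}

df <- tibble(stub = c(1, 2, 3, 4),
             stub_pvalue = c(.00, .04, .07, .2))

res <- add_marker(df, "stub")
res$stub_marker
```

Here `.data[[p_col]]` is what lets the condition evaluate on the column named var + "_pvalue", which is the part the question was stuck on.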

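To apply the function to multiple columns, one option is purrr. Each call returns the full data frame plus one marker column, so after binding the results column-wise we select the original columns and the new marker columns: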

#sample data with multiple sets of columns:
df <- tibble(stub = c(1,2,3,4),
             stub_pvalue = c(.00, .04, .07,.2),
             sho = c(8,7,6,5),
             sho_pvalue = c(.005, .03, .00,.24))
library(purrr)  

pmap_dfc(list(c("stub", "sho")), ~ assign_stars(df, ..1, 0.01, "***")) %>% 
  select(!! names(df), ends_with("marker"))

#> # A tibble: 4 x 6
#>    stub stub_pvalue   sho sho_pvalue stub_marker sho_marker
#>   <dbl>       <dbl> <dbl>      <dbl> <chr>       <chr>     
#> 1     1        0        8      0.005 1***        8***      
#> 2     2        0.04     7      0.03  2           7         
#> 3     3        0.07     6      0     3           6***      
#> 4     4        0.2      5      0.24  4           5
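
A possible alternative, sketched here under the assumption that a current dplyr/purrr is installed: column-binding several copies of the same data frame produces duplicated column names, and newer dplyr versions may rename those, which can break the `select()` step. Threading the data frame through `purrr::reduce()` adds one marker column per step and avoids the duplicates. The function body below is a compact restatement of `assign_stars()` so the snippet runs on its own.

```r
library(dplyr)
library(purrr)
library(rlang)

# Compact restatement of assign_stars() from above
assign_stars <- function(df, var, threshold, marker) {
  val <- sym(paste0(var, "_pvalue"))
  out <- sym(paste0(var, "_marker"))
  var <- sym(var)
  mutate(df, !!out := if_else(!!val < threshold,
                              paste0(!!var, marker),
                              as.character(!!var)))
}

df <- tibble(stub = c(1, 2, 3, 4),
             stub_pvalue = c(.00, .04, .07, .2),
             sho = c(8, 7, 6, 5),
             sho_pvalue = c(.005, .03, .00, .24))

# Start from df and add one marker column per variable name
result <- reduce(c("stub", "sho"),
                 function(acc, v) assign_stars(acc, v, 0.01, "***"),
                 .init = df)

names(result)
```

Because each step receives the output of the previous one, no `select()` clean-up is needed afterwards.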

We can also use a different threshold and marker for each column:

library(purrr)  

pmap_dfc(list(c("stub", "sho"), c(0.01, 0.04), c("*", "**")), 
         ~ assign_stars(df, ..1, ..2, ..3)) %>% 
   select(!! names(df), ends_with("marker"))

#> # A tibble: 4 x 6
#>    stub stub_pvalue   sho sho_pvalue stub_marker sho_marker
#>   <dbl>       <dbl> <dbl>      <dbl> <chr>       <chr>     
#> 1     1        0        8      0.005 1*          8**       
#> 2     2        0.04     7      0.03  2           7**       
#> 3     3        0.07     6      0     3           6**       
#> 4     4        0.2      5      0.24  4           5

M-- answered Nov 15 '22 07:11