I'm trying to generate a correlation matrix with significance stars. Take the following dataframe:
df <- tibble(stub = c(1,2,3,4),
stub_pvalue = c(.00, .04, .07,.2))
I'd like to write a function that pastes any column (e.g. stub in this example) concatenated with "***" if stub_pvalue is less than .01, and otherwise simply pastes stub. Something like:
assign_stars <- function(var) {
if (paste0(var,"_pvalue") < .01) {
paste0(var, "***")
} else {
paste0(var)
}
}
df %>%
mutate(col_with_stars = map_chr(col, assign_stars))
However, I can't figure out how to have the if's first logical condition evaluate on the var + "_pvalue". Can anyone help?
library(dplyr)
library(rlang)

assign_stars <- function(df, var, threshold, marker) {
  # Build symbols for the p-value and output columns from the column name
  val <- sym(paste(var, "pvalue", sep = "_"))
  out <- sym(paste(var, "marker", sep = "_"))
  var <- sym(var)
  # Append the marker only where the p-value is below the threshold
  mutate(df, !!out := if_else(!!val < threshold,
                              paste0(!!var, marker),
                              as.character(!!var)))
}
If we want to do this for only one column, the following works:
df %>%
assign_stars(., "stub", 0.01, "***")
# # A tibble: 4 x 3
# stub stub_pvalue stub_marker
# <dbl> <dbl> <chr>
# 1 1 0 1***
# 2 2 0.04 2
# 3 3 0.07 3
# 4 4 0.2 4
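The key idea above is that the column names are built as strings and only then turned into something mutate() can evaluate. As a rough alternative sketch, the same thing can be written with the `.data` pronoun and a glue-style name on the left of `:=`, which skips the sym()/!! step. This assumes dplyr 1.0 or later; the name assign_stars2 and the default arguments are mine, not part of the question.

```r
library(dplyr)

# Hypothetical variant of assign_stars(); assumes dplyr >= 1.0
assign_stars2 <- function(df, var, threshold = 0.01, marker = "***") {
  # Name of the matching p-value column, e.g. "stub" -> "stub_pvalue"
  pcol <- paste0(var, "_pvalue")
  mutate(df,
         # "{var}_marker" is interpolated to e.g. "stub_marker"
         "{var}_marker" := if_else(.data[[pcol]] < threshold,
                                   paste0(.data[[var]], marker),
                                   as.character(.data[[var]])))
}

df <- tibble(stub = c(1, 2, 3, 4),
             stub_pvalue = c(.00, .04, .07, .2))

out <- assign_stars2(df, "stub")
```

Note that if_else() is vectorised, which is why it works on whole columns here, whereas the plain if in the question only looks at a single value.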
But if we want to pass multiple columns to this function, we need to use purrr:
#sample data with multiple sets of columns:
df <- tibble(stub = c(1,2,3,4),
stub_pvalue = c(.00, .04, .07,.2),
sho = c(8,7,6,5),
sho_pvalue = c(.005, .03, .00,.24))
library(purrr)
pmap_dfc(list(c("stub", "sho")), ~ assign_stars(df, ..1, 0.01, "***")) %>%
select(!! names(df), ends_with("marker"))
#> # A tibble: 4 x 6
#> stub stub_pvalue sho sho_pvalue stub_marker sho_marker
#> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
#> 1 1 0 8 0.005 1*** 8***
#> 2 2 0.04 7 0.03 2 7
#> 3 3 0.07 6 0 3 6***
#> 4 4 0.2 5 0.24 4 5
We can also use a different threshold and marker for each column:
library(purrr)
pmap_dfc(list(c("stub", "sho"), c(0.01, 0.04), c("*", "**")),
~ assign_stars(df, ..1, ..2, ..3)) %>%
select(!! names(df), ends_with("marker"))
#> # A tibble: 4 x 6
#> stub stub_pvalue sho sho_pvalue stub_marker sho_marker
#> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
#> 1 1 0 8 0.005 1* 8**
#> 2 2 0.04 7 0.03 2 7**
#> 3 3 0.07 6 0 3 6**
#> 4 4 0.2 5 0.24 4 5
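Another way to handle several columns, sketched here as an alternative rather than a replacement, is to feed the data frame through the function once per column with purrr::reduce(), so each call adds one marker column to the result of the previous call and nothing has to be bound together or re-selected afterwards. The helper is restated so the snippet runs on its own; it uses a single threshold and marker for all columns.

```r
library(dplyr)
library(purrr)
library(rlang)

# Same helper as above, restated so this snippet is self-contained
assign_stars <- function(df, var, threshold, marker) {
  val <- sym(paste0(var, "_pvalue"))
  out <- sym(paste0(var, "_marker"))
  var <- sym(var)
  mutate(df, !!out := if_else(!!val < threshold,
                              paste0(!!var, marker),
                              as.character(!!var)))
}

df <- tibble(stub = c(1, 2, 3, 4),
             stub_pvalue = c(.00, .04, .07, .2),
             sho = c(8, 7, 6, 5),
             sho_pvalue = c(.005, .03, .00, .24))

# .x is the data frame accumulated so far, .y is the next column name
result <- reduce(c("stub", "sho"),
                 ~ assign_stars(.x, .y, 0.01, "***"),
                 .init = df)
```

The marker columns are appended in the order the names are given, after the original columns.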