I would like to modify column attributes inside of dplyr::mutate(). Here is some example data:
library(dplyr, warn.conflicts = FALSE)
v1 <- tibble(
  id    = 1:5,
  visit = 1,
  x     = 1:5,
  y     = c(0, 0, 0, 1, 1)
)
The end result I would like to get is the same result I would get from:
attr(v1$id, "source") <- "Collected at v1"
attr(v1$visit, "source") <- "Collected at v1"
attr(v1$x, "source") <- "Collected at v1"
attr(v1$y, "source") <- "Collected at v1"
I know I can just use a for loop.
for (col in names(v1)) {
  attr(v1[[col]], "source") <- "Collected at v1"
}
However, I have reasons for wanting to do this inside of dplyr. I'm trying to get to something like this.
v1 %>%
  mutate(
    across(
      .cols = everything(),
      .fns  = ~ function_to_update_attributes("source", "Collected at v1")
    )
  )
Right now, I can't even get this to work for a single variable. This is the closest I've gotten.
v1 %>%
  mutate(
    id = `<-`(attr(.[["id"]], "source"), "Collected at v1")
  )
Which returns
# A tibble: 5 × 4
  id              visit     x     y
  <chr>           <dbl> <int> <dbl>
1 Collected at v1     1     1     0
2 Collected at v1     1     2     0
3 Collected at v1     1     3     0
4 Collected at v1     1     4     1
5 Collected at v1     1     5     1
😞 Any constructive feedback is welcome and appreciated!
This is also posted on RStudio Community at: https://community.rstudio.com/t/modify-arbitrary-column-attributes-using-dplyr-mutate/144502
We can set the attribute directly on each column vector inside across() and return the modified vector:
library(dplyr)
v1 <- v1 %>%
  mutate(across(everything(), ~ {
    attr(.x, "source") <- "Collected at v1"
    .x
  }))
-output
> str(v1)
tibble [5 × 4] (S3: tbl_df/tbl/data.frame)
$ id : int [1:5] 1 2 3 4 5
..- attr(*, "source")= chr "Collected at v1"
$ visit: num [1:5] 1 1 1 1 1
..- attr(*, "source")= chr "Collected at v1"
$ x : int [1:5] 1 2 3 4 5
..- attr(*, "source")= chr "Collected at v1"
$ y : num [1:5] 0 0 0 1 1
..- attr(*, "source")= chr "Collected at v1"
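The function_to_update_attributes() placeholder from the question can be filled in with a small helper that sets the attribute and returns the vector. This is only a minimal sketch: the name set_col_attr() is hypothetical (it is not a dplyr function), and the result is stored in a new object, v1_tagged, so the original v1 stays untouched.

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical helper (not part of dplyr): set one attribute on a vector
# and return the modified vector so that mutate() can store it.
set_col_attr <- function(x, name, value) {
  attr(x, name) <- value
  x
}

v1 <- tibble(
  id    = 1:5,
  visit = 1,
  x     = 1:5,
  y     = c(0, 0, 0, 1, 1)
)

v1_tagged <- v1 %>%
  mutate(across(everything(), ~ set_col_attr(.x, "source", "Collected at v1")))

# Collect the "source" attribute of every column to check the result.
sources <- vapply(v1_tagged, attr, character(1), which = "source")
```

The helper has to return x explicitly. An assignment such as attr(x, name) <- value evaluates to the assigned value, which is why the single-column attempt in the question filled id with the string instead of tagging it.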
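Or, if we want to access the column by name, use cur_column() and assign with <<-. This modifies the existing v1 object as a side effect, so the result of the pipe is not assigned: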
v1 %>%
  mutate(across(everything(), ~ {
    attr(v1[[cur_column()]], "source") <<- "Collected at v1"
    .x
  }))
-output
> str(v1)
tibble [5 × 4] (S3: tbl_df/tbl/data.frame)
$ id : int [1:5] 1 2 3 4 5
..- attr(*, "source")= chr "Collected at v1"
$ visit: num [1:5] 1 1 1 1 1
..- attr(*, "source")= chr "Collected at v1"
$ x : int [1:5] 1 2 3 4 5
..- attr(*, "source")= chr "Collected at v1"
$ y : num [1:5] 0 0 0 1 1
..- attr(*, "source")= chr "Collected at v1"
Or, to replicate the behavior of the for loop, i.e. using the original data object together with its column names, use reduce() from purrr:
library(purrr)
v1 <- reduce(names(v1), ~ {
  attr(.x[[.y]], "source") <- "Collected at v1"
  .x
}, .init = v1)
-output
> str(v1)
tibble [5 × 4] (S3: tbl_df/tbl/data.frame)
$ id : int [1:5] 1 2 3 4 5
..- attr(*, "source")= chr "Collected at v1"
$ visit: num [1:5] 1 1 1 1 1
..- attr(*, "source")= chr "Collected at v1"
$ x : int [1:5] 1 2 3 4 5
..- attr(*, "source")= chr "Collected at v1"
$ y : num [1:5] 0 0 0 1 1
..- attr(*, "source")= chr "Collected at v1"
You can do it directly in a mutate(). You need to use set_attr() from the magrittr package.
mt2 <- mtcars %>% mutate(cyl2 = magrittr::set_attr(cyl, "source", "collected at v1"))
mt2 <- mtcars %>%
  mutate(across(everything(),
                .fns = function(x) magrittr::set_attr(x, "source", "collected at v1")))
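As a rough check of this approach on the question's own data: magrittr documents set_attr() as an alias for base R's `attr<-`, so calling the replacement function directly should behave the same way. The object names via_magrittr and via_base are just illustrative.

```r
library(dplyr, warn.conflicts = FALSE)

v1 <- tibble(id = 1:5, visit = 1, x = 1:5, y = c(0, 0, 0, 1, 1))

# magrittr::set_attr(x, which, value) returns x with the attribute set.
via_magrittr <- v1 %>%
  mutate(across(everything(), ~ magrittr::set_attr(.x, "source", "Collected at v1")))

# The same idea with the base R replacement function called in prefix form.
via_base <- v1 %>%
  mutate(across(everything(), ~ `attr<-`(.x, "source", "Collected at v1")))

# Compare the attributes produced by the two variants for one column.
identical(attributes(via_magrittr$id), attributes(via_base$id))
```

The magrittr alias mainly buys readability inside a pipeline; the base form avoids the extra dependency.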