I have a data frame of the following type:
df <- tibble::tribble(~x,
c("A", "B"),
c("A", "B", "C"),
c("A", "B", "C", "D"),
c("A", "B"))
and vectors like these
vec1 <- c("A", "B")
vec2 <- c("A", "B", "C")
vec3 <- c("A", "B", "C", "D")
I want to use mutate() to add a variable y that shows which vector each row of x matches. I tried the following, but y comes back empty, with the warning "longer object length is not a multiple of shorter object length":
df_new <- df %>%
  mutate(y = case_when(x == vec1 ~ "vec1",
                       x == vec2 ~ "vec2",
                       x == vec3 ~ "vec3"))
The desired output is
df_new <- tibble::tribble(~x, ~y,
c("A", "B"), "vec1",
c("A", "B", "C"), "vec2",
c("A", "B", "C", "D"), "vec3",
c("A", "B"), "vec1")
mutate() adds new variables and preserves existing ones; transmute() adds new variables and drops existing ones. New variables overwrite existing variables of the same name. Variables can be removed by setting their value to NULL.
To create new variables from existing variables based on conditions, use the case_when() function from the dplyr package in R.
The dplyr package is an add-on to R. It includes a host of cool functions for selecting, filtering, grouping, and arranging data. It also includes the mutate function.
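As a small illustration of how mutate() and case_when() work together on an ordinary (non-list) column, here is a minimal sketch. The `scores` tibble and its values are made up for this example and are not part of the question.

```r
library(dplyr)

# Hypothetical data, invented for this illustration
scores <- tibble::tibble(score = c(35, 72, 90))

# case_when() checks the conditions in order and uses the first one that is TRUE
graded <- scores %>%
  mutate(grade = case_when(
    score >= 80 ~ "high",
    score >= 50 ~ "medium",
    TRUE        ~ "low"    # fallback when no earlier condition matched
  ))

# Setting a column to NULL inside mutate() removes it
grade_only <- graded %>%
  mutate(score = NULL)
```

Here `graded` keeps both columns, while `grade_only` contains only `grade`.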
A solution using map2_lgl and identical to assess whether the vectors are the same.
library(tidyverse)
# For each element of the list column x, test whether it is identical to the given vector
df_new <- df %>%
  mutate(y = case_when(
    map2_lgl(x, list(vec1), ~identical(.x, .y)) ~ "vec1",
    map2_lgl(x, list(vec2), ~identical(.x, .y)) ~ "vec2",
    map2_lgl(x, list(vec3), ~identical(.x, .y)) ~ "vec3"
  ))
df_new
# # A tibble: 4 x 2
# x y
# <list> <chr>
# 1 <chr [2]> vec1
# 2 <chr [3]> vec2
# 3 <chr [4]> vec3
# 4 <chr [2]> vec1
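To make the role of identical in the answer above more concrete, here is a minimal base R sketch of the same test for a single vector. It uses vapply instead of map2_lgl purely so that the example has no package dependencies; that swap is a choice made for this illustration, not part of the original answer.

```r
# Same values as in the question, built with base R so the sketch is self-contained
x <- list(c("A", "B"), c("A", "B", "C"), c("A", "B", "C", "D"), c("A", "B"))
vec1 <- c("A", "B")

# identical() compares whole objects, so element order and length both matter
identical(c("A", "B"), vec1)   # TRUE
identical(c("B", "A"), vec1)   # FALSE

# One logical value per list element: is this element exactly vec1?
is_vec1 <- vapply(x, identical, logical(1), vec1)
is_vec1
# [1]  TRUE FALSE FALSE  TRUE
```

Because each list element is compared as a whole, there is no element-wise recycling, which is what triggered the length warning in the original attempt.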
Here's an alternative that's more programmatic: you don't need to specify each vector explicitly.
Data
df <- tibble::tribble(~x,
c("A", "B"),
c("A", "B", "C"),
c("A", "B", "C", "D"),
c("A", "B"))
vec1 <- c("A", "B")
vec2 <- c("A", "B", "C")
vec3 <- c("A", "B", "C", "D")
Solution - takes advantage of ls(...) to return the relevant vector names using a pattern
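library(dplyr)
# Anchor the pattern so only vec1, vec2, ... match (a bare "vec" would also pick up vecs itself on a re-run)
vecs <- ls(pattern = "^vec[0-9]+$")
L <- lapply(vecs, get)
names(L) <- vecs
df %>%
  mutate(y = names(L)[match(x, L)])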
# A tibble: 4 x 2
# x y
# <list> <chr>
# 1 <chr [2]> vec1
# 2 <chr [3]> vec2
# 3 <chr [4]> vec3
# 4 <chr [2]> vec1
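If you would rather not look objects up in the workspace by name, a variation on the same idea is to collect the candidate vectors in a named list yourself. The sketch below is only one possible way to do this: `label_of` is a hypothetical helper written for this illustration, and the extra `"Z"` element is added just to show what happens when nothing matches.

```r
# Question data plus one made-up element ("Z") that matches none of the vectors
x <- list(c("A", "B"), c("A", "B", "C"), c("A", "B", "C", "D"), c("A", "B"), "Z")

# Candidate vectors collected explicitly in a named list
candidates <- list(
  vec1 = c("A", "B"),
  vec2 = c("A", "B", "C"),
  vec3 = c("A", "B", "C", "D")
)

# Hypothetical helper: name of the first candidate identical to v, or NA if none
label_of <- function(v, candidates) {
  hit <- vapply(candidates, identical, logical(1), v)
  if (any(hit)) names(candidates)[which(hit)[1]] else NA_character_
}

y <- vapply(x, label_of, character(1), candidates)
y
# [1] "vec1" "vec2" "vec3" "vec1" NA
```

Inside a pipeline, the same call could be used as mutate(y = vapply(x, label_of, character(1), candidates)).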