I have some data that looks like this:
X1
A,B,C,D,E
A,B
A,B,C,D
A,B,C,D,E,F
I want to generate one column that holds the first element of each vector ("A"), and another column that holds all the rest of the values ("B","C" etc.):
X1 Col1 Col2
A,B,C,D,E A B,C,D,E
A,B A B
A,B,C,D A B,C,D
A,B,C,D,E,F A B,C,D,E,F
I have tried the following:
library(dplyr)
testdata <- data.frame(X1 = c("A,B,C,D,E",
"A,B",
"A,B,C,D",
"A,B,C,D,E,F")) %>%
mutate(Col1 = sapply(strsplit(X1, ","), "[", 1),
Col2 = sapply(strsplit(X1, ","), "[", -1))
However I cannot seem to get rid of the pesky vector brackets around the values in Col2. Any way of doing this?
You can use tidyr::separate with extra = "merge":
testdata %>%
tidyr::separate(X1, into = c("Col1","Col2"), sep = ",", extra = "merge", remove = F)
X1 Col1 Col2
1 A,B,C,D,E A B,C,D,E
2 A,B A B
3 A,B,C,D A B,C,D
4 A,B,C,D,E,F A B,C,D,E,F
A possible solution, using tidyr::separate:
library(tidyverse)
df <- data.frame(
stringsAsFactors = FALSE,
X1 = c("A,B,C,D,E", "A,B", "A,B,C,D", "A,B,C,D,E,F")
)
df %>%
separate(X1, into = str_c("col", 1:2), sep = "(?<=^.),", remove = F)
#> X1 col1 col2
#> 1 A,B,C,D,E A B,C,D,E
#> 2 A,B A B
#> 3 A,B,C,D A B,C,D
#> 4 A,B,C,D,E,F A B,C,D,E,F
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With