I'm looking for help on how to split a column of compound names into two columns, one for the first name and one for the last name.
df <- data.frame(PREFIX = c("A_B", "A_C", "A_D", "B_A", "A_B_C", "B_D_E", "C_B_A", "B_A"),
                 VALUE = c(1, 2, 3, 4, 5, 6, 7, 8))
The following handles the first part of the task, but I couldn't figure out how to select the last element when the rest of the string has a varying number of parts:
# split PREFIX into new columns
df$name1 = as.character(lapply(strsplit(as.character(df$PREFIX), split="_"), "[", 1))
You can use tail to grab the last element:
df$name2 = as.character(lapply(strsplit(as.character(df$PREFIX), split="_"),
tail, n=1))
df
# PREFIX VALUE name1 name2
# 1 A_B 1 A B
# 2 A_C 2 A C
# 3 A_D 3 A D
# 4 B_A 4 B A
# 5 A_B_C 5 A C
# 6 B_D_E 6 B E
# 7 C_B_A 7 C A
# 8 B_A 8 B A
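As a variation on the tail approach above, here is a minimal sketch that splits each string only once and pulls both the first and the last piece from the same result. The intermediate name `parts` is just illustrative, and the sketch assumes every PREFIX value is a non-empty string.

```r
# Same example data as in the question
df <- data.frame(
  PREFIX = c("A_B", "A_C", "A_D", "B_A", "A_B_C", "B_D_E", "C_B_A", "B_A"),
  VALUE  = c(1, 2, 3, 4, 5, 6, 7, 8)
)

# Split once on the literal underscore; `parts` is a list of character vectors
parts <- strsplit(as.character(df$PREFIX), split = "_", fixed = TRUE)

# vapply checks that each result is a single string
df$name1 <- vapply(parts, function(p) p[1], character(1))          # first piece
df$name2 <- vapply(parts, function(p) p[length(p)], character(1))  # last piece

df
```

Using vapply instead of as.character(lapply(...)) returns a character vector directly and fails loudly if any element does not yield exactly one string.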
You can also split on a "greedy" regular expression, which swallows any middle elements together with the underscores around them:
cbind(df, do.call(rbind, strsplit(as.character(df$PREFIX), "_|_.*_")))
# PREFIX VALUE 1 2
# 1 A_B 1 A B
# 2 A_C 2 A C
# 3 A_D 3 A D
# 4 B_A 4 B A
# 5 A_B_C 5 A C
# 6 B_D_E 6 B E
# 7 C_B_A 7 C A
# 8 B_A 8 B A
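If you would rather not split at all, a related sketch uses sub() to strip everything after the first underscore, or everything up to the last one. This is an alternative to the strsplit call above, not the same technique; the variable names `prefix`, `name1`, and `name2` are only for illustration.

```r
prefix <- c("A_B", "A_C", "A_D", "B_A", "A_B_C", "B_D_E", "C_B_A", "B_A")

# Remove the first underscore and everything after it, keeping the first piece
name1 <- sub("_.*$", "", prefix)

# Greedy ".*" runs up to the last underscore, so only the last piece remains
name2 <- sub("^.*_", "", prefix)

data.frame(PREFIX = prefix, name1 = name1, name2 = name2)
```

A string with no underscore is left unchanged by both calls, so its first and last names come out identical.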