I have a data frame like this:
id <- c("1", "2", "3")
col <- c("CHB_len_SCM_max", "CHB_brf_SCM_min", "CHB_PROC_S_SV_mean")
df <- data.frame(id, col)
I want to split "col" into two columns, Measurement and stat, where stat is the text after the last underscore (max, min, mean, etc.).
My desired output is
id Measurement stat
1 CHB_len_SCM max
2 CHB_brf_SCM min
3 CHB_PROC_S_SV mean
I tried it this way, but the stat column is empty. I am not sure whether my pattern is pointing to the last underscore.
library(tidyverse)
df1 <- df %>%
# Separate the sensors and the summary statistic
separate(col, into = c("Measurement", "stat"),sep = '\\_[^\\_]*$')
What am I missing here? Can someone point me in the right direction?
We could use extract, capturing two groups and making sure that the second group has one or more characters that are not a _ until the end ($) of the string:
library(tidyverse)
df %>%
extract(col, into = c("Measurement", "stat"), "(.*)_([^_]+)$")
# id Measurement stat
#1 1 CHB_len_SCM max
#2 2 CHB_brf_SCM min
#3 3 CHB_PROC_S_SV mean
Or use separate with a regex lookahead, so that only the last underscore itself is matched as the separator:
df %>%
separate(col, into = c("Measurement", "stat"), sep="_(?=[^_]+$)")
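
As for what was missing in the original attempt: separate() treats everything that sep matches as the delimiter and drops it, and the pattern '\\_[^\\_]*$' matches the last underscore together with the text after it, so nothing is left for stat. A minimal base R sketch of the difference, assuming the same col values as in the question (the names bad_sep, good_sep, measurement and stat are only for illustration):

```r
col <- c("CHB_len_SCM_max", "CHB_brf_SCM_min", "CHB_PROC_S_SV_mean")

# The original separator matches the last underscore AND the text after it,
# so the statistic becomes part of the delimiter and is thrown away.
bad_sep <- regmatches(col, regexpr("_[^_]*$", col))
bad_sep
# [1] "_max"  "_min"  "_mean"

# The lookahead version matches only the underscore; the text after it is
# checked but not consumed (lookarounds need perl = TRUE in base R).
good_sep <- regmatches(col, regexpr("_(?=[^_]+$)", col, perl = TRUE))
good_sep
# [1] "_" "_" "_"

# Base R equivalent of the split, without tidyr
measurement <- sub("_[^_]+$", "", col)  # drop the last underscore and what follows
stat <- sub("^.*_", "", col)            # greedy .* runs up to the last underscore
```

Here measurement holds "CHB_len_SCM", "CHB_brf_SCM", "CHB_PROC_S_SV" and stat holds "max", "min", "mean", which could be put back with data.frame(id, Measurement = measurement, stat) if tidyr is not available.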