I have a unique character column which is in "DD HH:MM" I am trying to separate the column into 3 new numeric columns. My issue with the separate function is that is replaces the original column and I can't figure out how to pass multiple criteria through SEP =
Please note that not every observation has "DD"
My current code looks like this:
separate(numeric_report,Planned.Duration..dd.hh.mm., into = c("Planned Days" ,"Planned Hours" , "Planned Minutes"),sep = ":")
Any advice would be greatly appreciated!
You can use remove = FALSE
to stop from dropping original columns and you can use pass multiple sep
argument using |
.
tidyr::separate(data, Planned.Duration..dd.hh.mm.,
into = c("Planned Days" ,"Planned Hours" , "Planned Minutes"),
sep = ":|\\s", remove = FALSE, fill = "left", convert = TRUE)
# Planned.Duration..dd.hh.mm. Planned Days Planned Hours Planned Minutes
#1 1:03 NA 1 3
#2 2 00:38 2 0 38
#3 0:49 NA 0 49
data
data <- data.frame(Planned.Duration..dd.hh.mm. =
c('1:03', '2 00:38','0:49'), stringsAsFactors = FALSE)
We can use separate
library(dplyr)
library(tidyr)
data %>%
separate( Planned.Duration..dd.hh.mm.,
into = c("Planned Days" ,"Planned Hours" , "Planned Minutes"),
sep = "[: ]+", remove = FALSE, fill = "left", convert = TRUE)
# Planned.Duration..dd.hh.mm. Planned Days Planned Hours Planned Minutes
#1 1:03 NA 1 3
#2 2 00:38 2 0 38
#3 0:49 NA 0 49
We can also use extract
data %>%
tidyr::extract(Planned.Duration..dd.hh.mm.,
into = c("Planned Days" ,"Planned Hours" , "Planned Minutes"),
"^(\\d+)? ?(\\d+):(\\d+)$", remove = FALSE, convert = TRUE)
# Planned.Duration..dd.hh.mm. Planned Days Planned Hours Planned Minutes
#1 1:03 NA 1 3
#2 2 00:38 2 0 38
#3 0:49 NA 0 49
data <- data.frame(Planned.Duration..dd.hh.mm. =
c('1:03', '2 00:38','0:49'), stringsAsFactors = FALSE)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With