I am trying to extract the first few characters from a column in a data frame. What I need is everything up to the first ",".
Data:
texts
12/5/15, 11:49 - thanks, take care
12/5/15, 11:51 - cool
What I need is
texts date
12/5/15, 11:49 - thanks, take care 12/5/15
12/5/15, 11:51 - cool 12/5/15
I tried the following, but both returned the whole text with the commas removed:
df$date <- sub(", ", "", df$date, fixed = TRUE)
and
df$date <- gsub( ".,","", df$texts)
Excel equivalent
=LEFT(A1, FIND(",",A1,1)-1)
You can use sub:
sub('(^.*?),.*', '\\1', df$texts)
# [1] "12/5/15" "12/5/15"
The pattern matches the start of the line ^, followed by any character . repeated zero or more times but as few as possible *?, all captured ( ... ), followed by a comma , and then any remaining characters .*, so the pattern matches the whole line. sub then replaces that match with the captured group \\1.
Other options: substr, strsplit, stringr::str_extract.
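As a rough sketch of the other options mentioned above, the base R alternatives might look like the following. The vector `texts` and the result names are made up for illustration; with a factor column you would wrap it in `as.character()` first.

```r
# Example input, copied from the question (hypothetical variable name)
texts <- c("12/5/15, 11:49 - thanks, take care", "12/5/15, 11:51 - cool")

# substr + regexpr: mirrors Excel's LEFT(A1, FIND(",", A1) - 1).
# Note: a string without a comma yields "" here, because regexpr returns -1.
date_substr <- substr(texts, 1, regexpr(",", texts, fixed = TRUE) - 1)

# strsplit: split on every comma and keep only the first piece of each string
date_split <- vapply(strsplit(texts, ",", fixed = TRUE), `[`, character(1), 1)

# regmatches: extract the leading run of non-comma characters
date_match <- regmatches(texts, regexpr("^[^,]*", texts))

# With the stringr package installed, a similar call would be:
# stringr::str_extract(texts, "^[^,]+")

print(date_substr)
```

All three should give the same result as the sub call for this data; they differ mainly in how they treat strings that contain no comma.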
If you're planning on using said dates, as.Date (or strptime, if you want the times too) can actually strip out what it needs:
as.Date(df$texts, '%m/%d/%y') # or '%d/%m/%y', if that's the format
# [1] "2015-12-05" "2015-12-05"
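If the times are needed as well, a minimal sketch with strptime could look like this. It assumes the month/day/year order and a 24-hour clock, and the time zone "UTC" is an arbitrary choice made only so the example is reproducible; `texts` and `stamps` are illustrative names.

```r
# Example input, copied from the question (hypothetical variable name)
texts <- c("12/5/15, 11:49 - thanks, take care", "12/5/15, 11:51 - cool")

# strptime reads only as much of each string as the format describes;
# the trailing message text is ignored.
stamps <- as.POSIXct(strptime(texts, "%m/%d/%y, %H:%M", tz = "UTC"))

format(stamps, "%Y-%m-%d %H:%M")
# [1] "2015-12-05 11:49" "2015-12-05 11:51"
```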
Data:
df <- structure(list(texts = structure(1:2, .Label = c("12/5/15, 11:49 - thanks, take care",
"12/5/15, 11:51 - cool"), class = "factor")), .Names = "texts",
class = "data.frame", row.names = c(NA, -2L))
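To get the new column the question asks for, the result of sub can be assigned back to the data frame. The sketch below rebuilds an equivalent data frame with data.frame() rather than structure(), purely to keep the example short; `stringsAsFactors = TRUE` is set so that texts is a factor, as in the data above.

```r
# Equivalent of the data above, with texts stored as a factor
df <- data.frame(
  texts = c("12/5/15, 11:49 - thanks, take care", "12/5/15, 11:51 - cool"),
  stringsAsFactors = TRUE
)

# sub coerces the factor to character, so date becomes a character column
df$date <- sub("(^.*?),.*", "\\1", df$texts)

# Optional: a proper Date column (the column name is illustrative)
df$date_parsed <- as.Date(df$date, "%m/%d/%y")

print(df)
```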