I wrote code to extract the date from a given string. Given
> "Date: 2012-07-29, 12:59AM PDT"
it extracts
> "2012-07-29"
The problem is that my code is lengthy and cumbersome to read. Is there a more elegant way of doing this?
raw_date = "Date: 2012-07-29, 12:59AM PDT"
# extract the date substring from raw_date
index = regexpr("[0-9]{4}-[0-9]{2}-[0-9]{2}", raw_date) # match start position; the match length is in the "match.length" attribute
start = index # position of the first character of the match
end = attr(index, "match.length")+start-1
date = substr(raw_date,start,end); date
As (pretty much) always, you've got multiple options here, though none of them really frees you from learning some basic regular expression syntax (or its close relatives).
> raw_date <- "Date: 2012-07-29, 12:59AM PDT"
> gsub(",", "", unlist(strsplit(raw_date, split=" "))[2])
[1] "2012-07-29"
> temp <- gsub(".*: (?=\\d)", "", raw_date, perl=TRUE)
> out <- gsub("(?<=\\d),.*", "", temp, perl=TRUE)
> out
[1] "2012-07-29"
> require("stringr")
> str_extract(raw_date, "\\d{4}-\\d{2}-\\d{2}")
[1] "2012-07-29"
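If you would rather stay in base R, the closest equivalent to `str_extract` is probably `regmatches()` paired with `regexpr()`. The following is a minimal sketch that reuses the pattern from the question; the variable names `m` and `date_str` are only illustrative.

```r
raw_date <- "Date: 2012-07-29, 12:59AM PDT"

# regexpr() locates the first match; regmatches() returns the matched text,
# so no manual start/end arithmetic is needed.
m <- regexpr("[0-9]{4}-[0-9]{2}-[0-9]{2}", raw_date)
date_str <- regmatches(raw_date, m)
date_str
```

For this input, `date_str` is `"2012-07-29"`. Note that `regmatches()` drops elements without a match, so for a vector of inputs the result can be shorter than the input.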
Something along these lines should work, provided the date always occupies characters 7 to 16 of the string:
x <- "Date: 2012-07-29, 12:59AM PDT"
as.Date(substr(x, 7, 16), format="%Y-%m-%d")
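If the position of the date may vary, one option is to locate it with a pattern first and convert afterwards. This is a rough sketch rather than a definitive solution: `extract_date` is a hypothetical helper, the second and third sample strings are made up for illustration, and inputs are assumed to be non-`NA` character strings.

```r
x <- c("Date: 2012-07-29, 12:59AM PDT",
       "Posted on 2012-07-30 (time unknown)",
       "no date here")

# Hypothetical helper: find the first yyyy-mm-dd substring in each element
# and convert it to a Date; elements without a match become NA.
extract_date <- function(s) {
  m <- regexpr("[0-9]{4}-[0-9]{2}-[0-9]{2}", s)
  matched <- m != -1                  # regexpr() returns -1 where nothing matched
  out <- rep(NA_character_, length(s))
  out[matched] <- regmatches(s, m)    # regmatches() returns matched elements only
  as.Date(out, format = "%Y-%m-%d")
}

res <- extract_date(x)
res
```

Unlike the fixed `substr(x, 7, 16)` call, this does not depend on the `"Date: "` prefix having a particular length, and it keeps the result aligned with the input by filling non-matching elements with `NA`.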