
R - character string with week-Year: week is lost when converting to Date format

I have a character vector of dates in year-week format:

weeks.strings <- c("2002-26", "2002-27", "2002-28", "2002-29", "2002-30", "2002-31")

However, converting this character to Date class results in a loss of week identifier:

> as.Date(weeks.strings, format="%Y-%U")
[1] "2002-08-28" "2002-08-28" "2002-08-28" "2002-08-28" "2002-08-28"
[6] "2002-08-28"

As shown above, the result is the year combined with today's month and day, so all information about the original week is lost (e.g., when using format or strptime to try to coerce back into the original format).

One solution I found in a help group is to specify the day of the week:

as.Date(weeks.strings, format="%Y-%u %U")
[1] "2002-02-12" "2002-02-19" "2002-02-26" "2002-03-05" "2002-01-02"
[6] "2002-01-09"

But it looks like this results in incorrect week numbering (doesn't match the original string).

Any guidance would be appreciated.

zzk asked Aug 28 '12 18:08


2 Answers

You just need to add a weekday to your weeks.strings in order to make the dates unambiguous (adapted from Jim Holtman's answer on R-help).

as.Date(paste(weeks.strings,1),"%Y-%U %u")
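
As a quick check of this idea, the sketch below parses the strings with an appended weekday and then formats the result back to year-week. The variable name dates is only illustrative. It relies on %u value 1 meaning Monday and on %U counting weeks that start on Sunday.

```r
weeks.strings <- c("2002-26", "2002-27", "2002-28", "2002-29", "2002-30", "2002-31")

# Append weekday 1 (Monday under %u) so each string names exactly one day
dates <- as.Date(paste(weeks.strings, 1), format = "%Y-%U %u")

# Formatting back with %Y-%U should recover the original strings
identical(format(dates, "%Y-%U"), weeks.strings)
```

If the round trip holds, the week identifier is preserved and can be recovered at any time with format(dates, "%Y-%U").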

As pointed out in the comments, the Date class is not appropriate if the dates span a long horizon because, at some point, the chosen weekday will not exist in the first or last week of a year. In that case you could use a numeric vector where the integer part is the year and the fractional part is the week number divided by 53. For example:

wkstr <- sprintf("%d-%02d", rep(2000:2012,each=53), 0:52)
yrwk <- lapply(strsplit(wkstr, "-"), as.numeric)
yrwk <- sapply(yrwk, function(x) x[1]+x[2]/53)
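
The same encoding can be wrapped in a pair of helpers so that the original string is recoverable. This is only a sketch: yearweek_to_num and num_to_yearweek are hypothetical names, and it assumes week numbers in the range 00 to 52 as in the example above.

```r
# Hypothetical helper: "YYYY-WW" -> year + week/53
yearweek_to_num <- function(s) {
  parts <- strsplit(s, "-", fixed = TRUE)
  yr <- as.numeric(vapply(parts, `[`, character(1), 1))
  wk <- as.numeric(vapply(parts, `[`, character(1), 2))
  yr + wk / 53
}

# Hypothetical inverse: year + week/53 -> "YYYY-WW"
num_to_yearweek <- function(x) {
  yr <- floor(x)
  wk <- round((x - yr) * 53)  # round() absorbs floating-point error
  sprintf("%d-%02d", as.integer(yr), as.integer(wk))
}

weeks.strings <- c("2002-26", "2002-27", "2002-28", "2002-29", "2002-30", "2002-31")
num <- yearweek_to_num(weeks.strings)
num_to_yearweek(num)
```

The numeric values sort in chronological order, which is usually what is needed for plotting or ordering, and no weekday has to be chosen.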
Joshua Ulrich answered Sep 24 '22 17:09


Obviously, there's no unique solution, since each week could be represented by any of up to 7 different dates. That said, here's one idea:

weeks.strings <- c("2002-26", "2002-27", "2002-28", "2002-29",
                   "2002-30", "2002-31")

x <- as.Date("2002-1-1", format="%Y-%m-%d") + (0:52*7)
x[match(weeks.strings, format(x, "%Y-%U"))]
# [1] "2002-07-02" "2002-07-09" "2002-07-16" "2002-07-23"
# [5] "2002-07-30" "2002-08-06"
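
A variation on the lookup above, sketched under the assumption that the first calendar day of each %U week is an acceptable representative: build every day in the range of interest and let match() pick the first day whose year-week label equals each string. The names days and first_day and the extra "2003-00" entry are illustrative only.

```r
weeks.strings <- c("2002-26", "2002-27", "2002-28", "2002-29",
                   "2002-30", "2002-31", "2003-00")

# Every day in the range the strings could fall in (assumed range)
days <- seq(as.Date("2002-01-01"), as.Date("2003-12-31"), by = "day")

# match() returns the first hit, i.e. the earliest day of each week
first_day <- days[match(weeks.strings, format(days, "%Y-%U"))]

# Round trip back to the original labels
identical(format(first_day, "%Y-%U"), weeks.strings)
```

Because the lookup only uses days that actually exist in each week, it also covers partial weeks such as week 00, where a fixed weekday may be missing.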
Josh O'Brien answered Sep 22 '22 17:09