I'm converting day of year into dates, and I noticed that as.Date
often returns unexpected (to me) results. Why do I get such different answers for these commands?
as.Date(x = 1, format = '%j', origin= '2015-01-01')
returns "2018-07-21"
as.Date(x = 1, origin= '2015-01-01')
returns "2015-01-02"
as.Date(x = 1, format = '%j', origin= as.Date('2015-01-01'))
returns "2015-01-02"
as.Date(x = '1',format = '%j', origin= '2015-01-01')
returns "2018-01-01"
as.Date(x = '1', origin= '2015-01-01')
returns an error: Error in charToDate(x) :
character string is not in a standard unambiguous format
I have tried to partially answer the question by looking at the definitions of the various methods under the S3 generic as.Date, and by debugging your code in RStudio and following the history of functions called.
The definitions of as.Date.numeric, as.Date.character and as.Date.default are provided at the bottom of the answer.
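If you want to inspect these methods yourself rather than rely on the copies at the bottom, a minimal sketch is shown here. It assumes a standard R installation where getS3method (from the utils package) is available; the variable names num_method and chr_method are only illustrative.

```r
# Look up the S3 methods that the generic as.Date dispatches to.
# getS3method() lives in utils, which is attached by default in
# interactive sessions; load it explicitly to be safe.
library(utils)

num_method <- getS3method("as.Date", "numeric")
chr_method <- getS3method("as.Date", "character")

# The class of the first argument decides which method is used.
class(1)             # a numeric x goes to as.Date.numeric
class("2015-01-01")  # a character origin goes to as.Date.character

# Printing a method shows its source; the exact body can differ
# between R versions.
print(num_method)
```

The bodies printed on your machine may not match the ones quoted in this answer exactly, since they have changed across R releases.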
I defined my own function check to debug what happens.
check <- function() {
    as.Date(x = 1, format = '%j', origin= '2015-01-01')
    as.Date(x = 1, origin= '2015-01-01')
}
In the first call, as.Date invokes UseMethod, which dispatches to as.Date.numeric. This in turn calls as.Date(origin, ...), which now gets dispatched to as.Date.character because origin is a character string. If you look at the source code of as.Date.character, the condition if (missing(format)) is FALSE, as the format %j has been passed along through the ... argument. So the piece of code which gets called is strptime(x, format, tz = "GMT"), where x is now the origin string '2015-01-01'. This returns 2018-07-20 IST, which gets converted to 2018-07-20 by the last call to as.Date. The likely reason is that %j reads at most three digits, so it takes the leading "201" of the origin string as day 201 of the current year (2018 when this was run), which is 20 July. Adding x = 1 then gives the 2018-07-21 you saw. Note that the time zone may vary for you depending on which country you are in. strptime internally calls a C function which cannot be debugged using this process.
In the second call, the main difference is that no format string was provided by the user. Following the same process as above, what gets called is the function charToDate, defined inside as.Date.character, instead of strptime directly, as the condition if (missing(format)) is TRUE. charToDate tries the default formats and finds a match in '%Y-%m-%d'. strptime is therefore given the correct format and computes the correct value 2015-01-01. This is now added to x, which is 1; remember the character version was called by the numeric version, where the code was as.Date(origin, ...) + x. This gives the correct answer, 2015-01-02.
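The second call can be checked by doing the same arithmetic by hand. This is a small sketch of what as.Date.numeric does when no format is passed, using illustrative variable names.

```r
# What as.Date.numeric does when no format is supplied:
# parse the origin with the default formats, then add x days.
origin <- "2015-01-01"
x <- 1

via_numeric <- as.Date(x, origin = origin)  # dispatches to as.Date.numeric
by_hand     <- as.Date(origin) + x          # the same computation, spelled out

print(via_numeric)
print(by_hand)
```

Both expressions give 2015-01-02, which matches the result reported in the question.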
While this does not provide a complete answer to your question, the general lesson is that the result depends heavily on the format string that ends up being passed to strptime, including when it is applied to origin rather than to x. Hope this helps.
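For the original goal of turning a day of year into a date, one option that avoids passing '%j' alongside an origin is plain date arithmetic. The helper doy_to_date below is hypothetical (it is not part of base R) and assumes the year is known and that day 1 means 1 January.

```r
# Hypothetical helper: convert a day of year (1-based) to a Date
# for a given year by adding (doy - 1) days to 1 January.
doy_to_date <- function(doy, year) {
    start_of_year <- as.Date(paste0(year, "-01-01"))
    start_of_year + (doy - 1)
}

first_day <- doy_to_date(1, 2015)   # 1 January 2015
leap_day  <- doy_to_date(60, 2016)  # day 60 of a leap year

print(first_day)
print(leap_day)
```

Because no format string is involved, the origin is always parsed with the default '%Y-%m-%d' format and the result does not depend on the current year.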
as.Date.numeric
function (x, origin, ...)
{
    if (missing(origin))
        stop("'origin' must be supplied")
    as.Date(origin, ...) + x
}
as.Date.character
function (x, format, tryFormats = c("%Y-%m-%d", "%Y/%m/%d"),
    optional = FALSE, ...)
{
    charToDate <- function(x) {
        xx <- x[1L]
        if (is.na(xx)) {
            j <- 1L
            while (is.na(xx) && (j <- j + 1L) <= length(x)) xx <- x[j]
            if (is.na(xx))
                f <- "%Y-%m-%d"
        }
        if (is.na(xx))
            strptime(x, f)
        else {
            for (ff in tryFormats) if (!is.na(strptime(xx, ff,
                tz = "GMT")))
                return(strptime(x, ff))
            if (optional)
                as.Date.character(rep.int(NA_character_, length(x)),
                  "%Y-%m-%d")
            else stop("character string is not in a standard unambiguous format")
        }
    }
    res <- if (missing(format))
        charToDate(x)
    else strptime(x, format, tz = "GMT")
    as.Date(res)
}
as.Date.default
function (x, ...)
{
    if (inherits(x, "Date"))
        x
    else if (is.logical(x) && all(is.na(x)))
        .Date(as.numeric(x))
    else stop(gettextf("do not know how to convert '%s' to class %s",
        deparse(substitute(x)), dQuote("Date")), domain = NA)
}