
seeking explanation for as.Date() function in R

Tags:

r

I'm converting day of year into dates, and I noticed that as.Date often returns unexpected (to me) results. Why do I get such different answers for these commands?

as.Date(x =  1, format = '%j', origin= '2015-01-01')

returns "2018-07-21"

as.Date(x = 1, origin= '2015-01-01')

returns "2015-01-02"

as.Date(x =  1, format = '%j', origin= as.Date('2015-01-01'))

returns "2015-01-02"

as.Date(x = '1',format = '%j', origin= '2015-01-01')

returns "2018-01-01"

as.Date(x = '1', origin= '2015-01-01')

returns an error: Error in charToDate(x) : character string is not in a standard unambiguous format

asked Jun 22 '18 12:06 by filups21


1 Answer

I have tried to partially answer the question below, by looking at the definition of the various methods under the S3 generic as.Date, as well as debugging your code via RStudio and looking at the history of functions called.

The definitions of as.Date.numeric, as.Date.character and as.Date.default are provided at the bottom of the answer.

I defined my own function check to debug what happens.

check <- function() {
  as.Date(x = 1, format = '%j', origin = '2015-01-01')
  as.Date(x = 1, origin = '2015-01-01')
}

In the first call, as.Date invokes UseMethod, which dispatches to as.Date.numeric. This in turn calls as.Date(origin, ...), which is dispatched to as.Date.character, with format = '%j' passed along through the dots. If you look at the source code of as.Date.character, the condition if (missing(format)) is FALSE, as the format %j has been provided in this case. So the piece of code which gets called is strptime(x, format, tz = "GMT"), where x is now the origin string '2015-01-01'. %j reads at most three digits, so only the leading 201 is parsed, as day 201 of the current year, and the rest of the string is ignored. On my machine this returned 2018-07-20 IST, which gets converted to 2018-07-20 by the last call to as.Date. Back in as.Date.numeric, x = 1 is then added, which gives the 2018-07-21 you saw. Note that the time zone may vary for you depending on which country you are in. strptime internally calls a C function which cannot be debugged using this process.
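The first call can be reproduced step by step. This is only a minimal sketch: the variable names are illustrative, and the year in the result is whatever the current year is when you run it (2018 in the outputs quoted above).

```r
# Step 1: what as.Date.character does with the origin when format = "%j".
# %j consumes the leading "201"; the rest of the string is ignored.
origin_parsed <- as.Date(strptime("2015-01-01", format = "%j", tz = "GMT"))
format(origin_parsed, "%j")

# Step 2: as.Date.numeric then adds x = 1 to that date.
first_call <- as.Date(x = 1, format = "%j", origin = "2015-01-01")
first_call - origin_parsed

# Cross-check: day 201 of the current year, counted from 1 January, plus x = 1.
jan1 <- as.Date(paste0(format(Sys.Date(), "%Y"), "-01-01"))
first_call == jan1 + 200 + 1
```

So the origin's year, 2015, never enters the result at all; only its first three characters do.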

In the second call, the main difference is that no format string was provided by the user. Following the same process as above, the condition if (missing(format)) is now TRUE, so what gets called is charToDate, a function defined inside as.Date.character, rather than strptime with a user-supplied format. charToDate tries the default formats in tryFormats and finds a match in '%Y-%m-%d'. strptime is therefore given the correct format and computes the correct value, 2015-01-01. This is now added to x, which is 1; remember that the character method was called by the numeric method, where the code was as.Date(origin, ...) + x. This gives the expected answer, 2015-01-02.
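A short sketch of the second call, split into the same two steps (the variable names are mine, not part of base R):

```r
# With no format, charToDate() tries "%Y-%m-%d" first, which matches the origin.
origin_date <- as.Date("2015-01-01")

# as.Date.numeric then adds x = 1 to the parsed origin.
second_call <- as.Date(x = 1, origin = "2015-01-01")

format(second_call)             # "2015-01-02"
second_call == origin_date + 1  # TRUE
```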

While this does not provide a complete answer to your question, the general lesson is that the result depends heavily on the format string being passed to strptime, and on which string it ends up being applied to. Hope this helps.
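For the original goal of converting a day of year into a date, two approaches that avoid the surprise are sketched below. The year and the day-of-year values are made-up examples, and the variable names are only illustrative.

```r
year <- 2015
doy <- c(1, 32, 365)   # hypothetical day-of-year values

# Option 1: numeric offset from 1 January, with no format argument.
# Day 1 is the origin itself, so subtract 1 before adding.
from_origin <- as.Date(doy - 1, origin = paste0(year, "-01-01"))

# Option 2: pass a string that contains both the year and the day of year,
# so that %Y and %j are each applied to the part they are meant for.
from_format <- as.Date(paste(year, doy), format = "%Y %j")

format(from_origin)   # "2015-01-01" "2015-02-01" "2015-12-31"
all(from_origin == from_format)
```

The key point in both is to not combine a numeric x with format = '%j', because the format is then applied to the origin rather than to x.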

as.Date.numeric

function (x, origin, ...)
{
  if (missing(origin))
    stop("'origin' must be supplied")
  as.Date(origin, ...) + x
}

as.Date.character

function (x, format, tryFormats = c("%Y-%m-%d", "%Y/%m/%d"),
          optional = FALSE, ...)
{
  charToDate <- function(x) {
    xx <- x[1L]
    if (is.na(xx)) {
      j <- 1L
      while (is.na(xx) && (j <- j + 1L) <= length(x)) xx <- x[j]
      if (is.na(xx))
        f <- "%Y-%m-%d"
    }
    if (is.na(xx))
      strptime(x, f)
    else {
      for (ff in tryFormats) if (!is.na(strptime(xx, ff,
                                                 tz = "GMT")))
        return(strptime(x, ff))
      if (optional)
        as.Date.character(rep.int(NA_character_, length(x)),
                          "%Y-%m-%d")
      else stop("character string is not in a standard unambiguous format")
    }
  }
  res <- if (missing(format))
    charToDate(x)
  else strptime(x, format, tz = "GMT")
  as.Date(res)
}
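Going by the definition above, the two character-input calls from the question can be traced the same way. This is a sketch under the assumption that the installed R version matches the source shown here; the variable names are illustrative.

```r
# x is a string, so as.Date.character parses x itself with %j.
# origin is swallowed by ... and never used.
fourth_call <- as.Date(x = "1", format = "%j", origin = "2015-01-01")
jan1 <- as.Date(paste0(format(Sys.Date(), "%Y"), "-01-01"))
fourth_call == jan1   # day 1 of the current year

# No format, and "1" matches neither entry of tryFormats,
# so charToDate() stops with an error; capture its message.
fifth_msg <- tryCatch(
  as.Date(x = "1", origin = "2015-01-01"),
  error = function(e) conditionMessage(e)
)
fifth_msg
```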

as.Date.default

function (x, ...)
{
  if (inherits(x, "Date"))
    x
  else if (is.logical(x) && all(is.na(x)))
    .Date(as.numeric(x))
  else stop(gettextf("do not know how to convert '%s' to class %s",
                     deparse(substitute(x)), dQuote("Date")), domain = NA)
}
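This also suggests why the third call in the question behaves well. Going by the definitions above, as.Date(origin, ...) with a Date origin presumably falls through to as.Date.default, whose inherits(x, "Date") branch returns the origin unchanged, so the format argument has no effect. A minimal check (variable names are mine):

```r
# The origin is already a Date, so it is not re-parsed with "%j".
origin_date <- as.Date("2015-01-01")
third_call <- as.Date(x = 1, format = "%j", origin = origin_date)

format(third_call)             # "2015-01-02"
third_call == origin_date + 1  # TRUE
```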
answered Nov 15 '22 03:11 by radmuzom