
In R, is the %OSn time format only valid for formatting, but not parsing?

Consider this R code, which uses a defined time format string (the timeFormat variable below) to format and parse dates:


time = as.POSIXct(1433867059, origin = "1970-01-01")
print(time)
print( as.numeric(time) )

timeFormat = "%Y-%m-%d %H:%M:%OS3"
tz = "EST"

timestamp = format(time, format = timeFormat, tz = tz)
print(timestamp)

timeParsed = as.POSIXct(timestamp, format = timeFormat, tz = tz)
print(timeParsed)
print( as.numeric(timeParsed) )

If I paste that into Rgui on my Windows box, which is running the latest (3.2.0) stable release, I get this:


> time = as.POSIXct(1433867059, origin = "1970-01-01")
> print(time)
[1] "2015-06-09 12:24:19 EDT"
> print( as.numeric(time) )
[1] 1433867059
> 
> timeFormat = "%Y-%m-%d %H:%M:%OS3"
> tz = "EST"
> 
> timestamp = format(time, format = timeFormat, tz = tz)
> print(timestamp)
[1] "2015-06-09 11:24:19.000"
> 
> timeParsed = as.POSIXct(timestamp, format = timeFormat, tz = tz)
> print(timeParsed)
[1] NA
> print( as.numeric(timeParsed) )
[1] NA

Notice how the time format, which ends with %OS3, produces the correct time stamp (with 3-digit, i.e. millisecond, resolution).

However, that same time format cannot parse that time stamp back into the original POSIXct value; it barfs and parses NA.

Anyone know what is going on?

A web search found this Stack Overflow link, where a commenter on the first answer, Waldir Leoncio, appears to describe the same parsing problem with %OS3 that I see:

"use, for example, strptime(y, "%d.%m.%Y %H:%M:%OS3"), but it doesn't work for me. Henrik noted that the function's help page, ?strptime states that the %OS3 bit is OS-dependent. I'm using an updated Ubuntu 13.04 and using %OS3 yields NA."

The help page mentioned in the quote above likely is this link, which is unfortunately terse, merely saying

"Specific to R is %OSn, which for output gives the seconds truncated to 0 <= n <= 6 decimal places (and if %OS is not followed by a digit, it uses the setting of getOption("digits.secs"), or if that is unset, n = 3). Further, for strptime %OS will input seconds including fractional seconds. Note that %S ignores (and not rounds) fractional parts on output."

That final sentence about strptime (i.e. parsing) is subtle: it says "for strptime %OS". Note the absence of an 'n': it says %OS instead of %OSn.

Does that mean that %OSn can NOT be used for parsing, only for formatting?

That is what I have empirically found, but is it expected behavior or a bug?

Very annoying if expected behavior, since that means that I need different time formats for formatting and parsing. Have never seen that before in any other language's date API...

(Aside: I am aware that there is another issue, even if you just want to format, with %OSn: R truncates fractional parts instead of rounds. For those not aware of this bad behavior, its hazards are discussed here, here, and here.)

HaroldFinch asked Nov 01 '22 01:11



1 Answer

This is expected behavior, not a bug. "%OSn" is for output. "%OS" is for input, and includes fractional seconds, as it says in your second blockquote:

Further, for strptime %OS will input seconds including fractional seconds.

options(digits.secs=6)
as.POSIXct("2015-06-09 11:24:19.002", "America/New_York", "%Y-%m-%d %H:%M:%OS")
# [1] "2015-06-09 11:24:19.002 EDT"
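
If keeping a single format string matters, one possible workaround is to derive the parsing format from the output format by stripping the digit after %OS. This is only a minimal sketch: the name parseFormat and the sub() pattern are my own illustration, not something ?strptime prescribes, and it assumes the time zone database knows "America/New_York".

```r
# One format string for output; the input format is derived from it.
timeFormat  <- "%Y-%m-%d %H:%M:%OS3"
# Hypothetical helper step: drop the digit after %OS for parsing.
parseFormat <- sub("%OS[0-6]", "%OS", timeFormat)

tz   <- "America/New_York"
time <- as.POSIXct(1433867059, origin = "1970-01-01", tz = "UTC")

# Format with %OS3, parse back with plain %OS.
timestamp  <- format(time, format = timeFormat, tz = tz)
timeParsed <- as.POSIXct(timestamp, format = parseFormat, tz = tz)

print(parseFormat)             # "%Y-%m-%d %H:%M:%OS"
print(timestamp)               # "2015-06-09 12:24:19.000"
print(as.numeric(timeParsed))  # 1433867059
```

The round trip is exact here because the time has no fractional part. With real fractional seconds, the truncation on output mentioned in the question still applies.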

Also note that "EST" is an ambiguous timezone, and probably not what you expect. See the Time zone names section of ?timezone.
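
To see why "EST" may not be what you expect, compare it with a region-based zone name for a summer date. This is a rough sketch that assumes your platform's time zone database treats "EST" as a fixed UTC-5 offset with no daylight saving time, which matches the output shown in the question:

```r
# The same instant as in the question, formatted in two zones.
time <- as.POSIXct(1433867059, origin = "1970-01-01", tz = "UTC")

est <- format(time, format = "%H:%M:%S", tz = "EST")
ny  <- format(time, format = "%H:%M:%S", tz = "America/New_York")

print(est)  # "11:24:19" (no daylight saving applied)
print(ny)   # "12:24:19" (EDT in June)
```

If you want US Eastern wall-clock time, including daylight saving time, "America/New_York" is usually the safer choice.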

Joshua Ulrich answered Nov 09 '22 12:11
