
R issue with rounding milliseconds

How do I get around the following issue with rounding milliseconds in R, so that the times are correct?

> options(digits.secs=3)
> as.POSIXlt("13:29:56.061", format='%H:%M:%OS', tz='UTC')
[1] "2012-06-07 13:29:56.060 UTC"
> as.POSIXlt("13:29:56.062", format='%H:%M:%OS', tz='UTC')
[1] "2012-06-07 13:29:56.061 UTC"
> as.POSIXlt("13:29:56.063", format='%H:%M:%OS', tz='UTC')
[1] "2012-06-07 13:29:56.063 UTC"

I noticed that this URL provides background information but doesn't solve my issue: Milliseconds puzzle when calling strptime in R.

Also this URL touches on the issue but doesn't solve it: R xts: .001 millisecond in index.

In these cases I do see the following:

> x <- as.POSIXlt("13:29:56.061", format='%H:%M:%OS', tz='UTC')
> print(as.numeric(x), digits=20)
[1] 1339075796.0610001087

The URL also seems to indicate that this is just a display issue, but I've noticed that using format strings like "%OS3" without the options line doesn't seem to pick up the correct number of digits.

The version I'm using is 32 bit 2.15.0 under Windows but this seems to exist under other situations for R.

Note that my original data consists of date-time strings like these in a CSV file, so I must find a way of converting them from strings into the correct millisecond times.

Andrew Stern asked Jun 07 '12 12:06



2 Answers

I don't see that:

> options(digits.secs = 4)
> as.POSIXlt("13:29:56.061", format = '%H:%M:%OS', tz='UTC')
[1] "2012-06-07 13:29:56.061 UTC"
> as.POSIXlt("13:29:56.062", format = '%H:%M:%OS', tz='UTC')
[1] "2012-06-07 13:29:56.062 UTC"
> as.POSIXlt("13:29:56.063", format = '%H:%M:%OS', tz='UTC')
[1] "2012-06-07 13:29:56.063 UTC"
> options(digits.secs = 3)
> as.POSIXlt("13:29:56.061", format = '%H:%M:%OS', tz='UTC')
[1] "2012-06-07 13:29:56.061 UTC"
> as.POSIXlt("13:29:56.062", format = '%H:%M:%OS', tz='UTC')
[1] "2012-06-07 13:29:56.062 UTC"
> as.POSIXlt("13:29:56.063", format = '%H:%M:%OS', tz='UTC')
[1] "2012-06-07 13:29:56.063 UTC"

with

> sessionInfo()
R version 2.15.0 Patched (2012-04-14 r59019)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_GB.utf8       LC_NUMERIC=C             
 [3] LC_TIME=en_GB.utf8        LC_COLLATE=en_GB.utf8    
 [5] LC_MONETARY=en_GB.utf8    LC_MESSAGES=en_GB.utf8   
 [7] LC_PAPER=C                LC_NAME=C                
 [9] LC_ADDRESS=C              LC_TELEPHONE=C           
[11] LC_MEASUREMENT=en_GB.utf8 LC_IDENTIFICATION=C      

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods  
[7] base

With the "%OSn" format strings, one forces truncation. If the fractional second cannot be represented exactly in floating point, then the truncation may very well go the wrong way. If you see things going the wrong way, you can also round explicitly to the unit you want, or add half of the unit you wish to operate at (0.0005 in the case shown):

> t1 <- as.POSIXlt("13:29:56.061", format = '%H:%M:%OS', tz='UTC')
> t1
[1] "2012-06-07 13:29:56.061 UTC"
> t1 + 0.0005
[1] "2012-06-07 13:29:56.061 UTC"

(But as I said, I don't see the problem here.)

This latter point was made by Simon Urbanek on the R-Devel mailing list on 30-May-2012.
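The half-unit trick can be wrapped up as a small function. This is only a minimal sketch: `format_ms` is a made-up name, not something from base R, and it assumes you want a time-of-day string in UTC with exactly three decimals. It relies on the documented behaviour that "%OS3" truncates on output.

```r
# Hypothetical helper (not part of base R): format times to the millisecond,
# rounding to the nearest millisecond rather than truncating.
format_ms <- function(x, tz = "UTC") {
  x <- as.POSIXct(x, tz = tz)
  # "%OS3" truncates, so shifting by half a millisecond first
  # turns the truncation into round-to-nearest.
  format(x + 0.0005, "%H:%M:%OS3", tz = tz)
}

times <- as.POSIXlt(c("13:29:56.061", "13:29:56.062", "13:29:56.063"),
                    format = "%H:%M:%OS", tz = "UTC")
format_ms(times)
```

Because the shift is applied only inside the formatting call, the stored times themselves are left unchanged.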

Gavin Simpson answered Sep 22 '22 07:09



This is the same problem as Milliseconds puzzle when calling strptime in R.

Your example:

> x <- as.POSIXlt("13:29:56.061", format='%H:%M:%OS', tz='UTC')
> print(as.numeric(x), digits=20)
[1] 1339075796.0610001087

is not representative of the problem. as.numeric(x) converts your POSIXlt object to POSIXct before converting to numeric, so you get different floating-point-precision rounding errors.

That's not how print.POSIXlt (which calls format.POSIXlt) works. format.POSIXlt formats each element of the POSIXlt list construct individually, so you would need to look at:

> print(x$sec, digits=20)
[1] 56.060999999999999943

And that number is truncated at the third decimal place, so you see 56.060. You can see this by calling format directly:

> format(x, "%H:%M:%OS6")
[1] "13:29:56.060999"
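Since the stored value is only a hair below the intended millisecond, one way to get correct values out of the CSV strings is to round the seconds element yourself instead of relying on the truncating format. A rough sketch, assuming the strings are always of the form "HH:MM:SS.mmm"; `ms_since_midnight` is a hypothetical name for illustration, not an existing function:

```r
# Hypothetical helper: parse "HH:MM:SS.mmm" strings (e.g. a CSV column) and
# return whole milliseconds since midnight, rounded rather than truncated.
ms_since_midnight <- function(s) {
  lt <- as.POSIXlt(s, format = "%H:%M:%OS", tz = "UTC")
  # Work from the individual list elements, as format.POSIXlt does,
  # and round the fractional seconds to the nearest millisecond.
  lt$hour * 3600000 + lt$min * 60000 + round(lt$sec * 1000)
}

times <- c("13:29:56.061", "13:29:56.062", "13:29:56.063")
ms <- ms_since_midnight(times)
ms %% 1000   # millisecond part of each time
```

Here a value such as 56.060999999999999943 becomes 56061 after multiplying by 1000 and rounding, so the millisecond parts come back as 61, 62 and 63.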
Joshua Ulrich answered Sep 20 '22 07:09
