Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

as.Date(as.POSIXct()) gives the wrong date?

I'd been trying to look through a dataframe extracting all rows where the date component of a POSIXct column matched a certain value.I came across the following which is confusing me mightily:: as.Date(as.POSIXct(...)) doesn't always return the correct date.

> dt <- as.POSIXct('2012-08-06 09:35:23')
[1] "2012-08-06 09:35:23 EST"
> as.Date(dt)
[1] "2012-08-05"

Why is the date of '2012-08-06 09:35:23' equal to '2012-08-05?

I suspect it's something to do with different timezones being used, so noting that the timezone of dt was 'EST' I gave this to as.Date::

> as.Date(as.POSIXct('2012-08-06 09:35:23'), tz='EST')
[1] "2012-08-05"

But it still returns 2012-08-05.

Why is this? How can I find all datetimes in my dataframe that were on the date 2012-08-06? (as subset(my.df, as.character(as.Date(datetime), tz='EST') == '2012-08-06') does not return the row with datetime dt even though this did occur on the date 2012-08-06...)?

Added details: Linux 64bit (though can reproduce on 32bit), can get this on both R 3.0.1 & 3.0.0, and I am currently AEST (Australian Eastern Standard Time)

like image 646
mathematical.coffee Avatar asked Jun 13 '13 23:06

mathematical.coffee


People also ask

Is POSIXct a date?

POSIXct stores both a date and time with an associated time zone. The default time zone selected, is the time zone that your computer is set to which is most often your local time zone. POSIXct stores date and time in seconds with the number of seconds beginning at 1 January 1970.

What is the difference between POSIXct and POSIXlt and as date?

There are two POSIX date/time classes, which differ in the way that the values are stored internally. The POSIXct class stores date/time values as the number of seconds since January 1, 1970, while the POSIXlt class stores them as a list with elements for second, minute, hour, day, month, and year, among others.

How do I change date format in R?

To format = , provide a character string (in quotes) that represents the current date format using the special “strptime” abbreviations below. For example, if your character dates are currently in the format “DD/MM/YYYY”, like “24/04/1968”, then you would use format = "%d/%m/%Y" to convert the values into dates.

What does as date do in R?

You can use the as. Date( ) function to convert character data to dates. The format is as. Date(x, "format"), where x is the character data and format gives the appropriate format.


1 Answers

The safe way to do this is to pass the date value through format. This does create an additional step but as.Date will accept the character result if it is formated with a "-" or "/":

as.Date( format( as.POSIXct('2019-03-11 23:59:59'), "%Y-%m-%d") )
[1] "2019-03-11"

as.Date(  as.POSIXct('2019-03-11 23:59:59') ) # I'm in a locale where the problem might exist
[1] "2019-03-12"

The documentation for timezones is confusing to me too. In some (and this case as it turned out) case EST may not be unambiguous and may actually refer to a tz in Australia. Try "EST5EDT" or "America/New_York" if you happen to be in North America.

In this case it could also relate to differences in how your unstated OS handles the 'tz' argument, since I get "2012-08-06". ( I'm in PDT US tz at the moment, although I'm not sure that should matter. )Changing which function gets the tz argument may clarify (or not):

> as.Date(as.POSIXct('2012-08-06 19:35:23', tz='EST'))
[1] "2012-08-07"
> as.Date(as.POSIXct('2012-08-06 17:35:23', tz='EST'))
[1] "2012-08-06"


> as.Date(as.POSIXct('2012-08-06 21:35:23'), tz='EST')
[1] "2012-08-06"
> as.Date(as.POSIXct('2012-08-06 22:35:23'), tz='EST')
[1] "2012-08-07"

If you omit the tz from as.POSIXct then UTC is assumed.

These are the unambiguous names of the Ozzie TZ's (at least on my Mac):

tzfile <- "/usr/share/zoneinfo/zone.tab"
tzones <- read.delim(tzfile, row.names = NULL, header = FALSE,
    col.names = c("country", "coords", "name", "comments"),
    as.is = TRUE, fill = TRUE, comment.char = "#")
grep("^Aus", tzones$name, value=TRUE)
 [1] "Australia/Lord_Howe"   "Australia/Hobart"     
 [3] "Australia/Currie"      "Australia/Melbourne"  
 [5] "Australia/Sydney"      "Australia/Broken_Hill"
 [7] "Australia/Brisbane"    "Australia/Lindeman"   
 [9] "Australia/Adelaide"    "Australia/Darwin"     
[11] "Australia/Perth"       "Australia/Eucla" 
like image 191
IRTFM Avatar answered Sep 20 '22 15:09

IRTFM