
Difference between as.POSIXct/as.POSIXlt and strptime for converting character vectors to POSIXct/POSIXlt

I have followed a number of questions here that ask how to convert character vectors to datetime classes. I often see two methods: strptime and as.POSIXct/as.POSIXlt. I looked at the functions but am unclear what the difference is.

strptime

function (x, format, tz = "")
{
    y <- .Internal(strptime(as.character(x), format, tz))
    names(y$year) <- names(x)
    y
}
<bytecode: 0x045fcea8>
<environment: namespace:base>

as.POSIXct

function (x, tz = "", ...)
UseMethod("as.POSIXct")
<bytecode: 0x069efeb8>
<environment: namespace:base>

as.POSIXlt

function (x, tz = "", ...)
UseMethod("as.POSIXlt")
<bytecode: 0x03ac029c>
<environment: namespace:base>

Doing a microbenchmark to see if there are performance differences:

library(microbenchmark)
Dates <- sample(c(dates = format(seq(ISOdate(2010,1,1), by='day', length=365), format='%d-%m-%Y')), 5000, replace = TRUE)
df <- microbenchmark(strptime(Dates, "%d-%m-%Y"), as.POSIXlt(Dates, format = "%d-%m-%Y"), times = 1000)

Unit: milliseconds
                                    expr      min       lq   median       uq      max
1 as.POSIXlt(Dates, format = "%d-%m-%Y") 32.38596 33.81324 34.78487 35.52183 61.80171
2            strptime(Dates, "%d-%m-%Y") 31.73224 33.22964 34.20407 34.88167 52.12422

strptime seems slightly faster. So what gives? Why would there be two similar functions, or are there differences between them that I missed?

R J asked May 22 '12 09:05



1 Answer

Well, the functions do different things.

First, there are two internal representations of date/time: POSIXct, which stores seconds since the UNIX epoch (plus some attributes such as the time zone), and POSIXlt, which stores a list of day, month, year, hour, minute, second, etc.
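To make the two representations concrete, here is a minimal sketch using only base R. The timestamp and the variable names lt and ct are just illustrative, and the time zone is pinned to UTC so the numbers do not depend on your machine's settings.

```r
# Parse one timestamp in UTC so the result does not depend on the local zone
lt <- strptime("22-05-2012 09:05:00", "%d-%m-%Y %H:%M:%S", tz = "UTC")
ct <- as.POSIXct(lt)

class(lt)        # "POSIXlt" "POSIXt"
class(ct)        # "POSIXct" "POSIXt"

# POSIXct is a single number: seconds since 1970-01-01 00:00:00 UTC
as.numeric(ct)   # 1337677500

# POSIXlt is a list of broken-down fields
lt$mday          # day of the month: 22
lt$mon           # months since January (0-based): 4
lt$year          # years since 1900: 112
lt$hour          # 9
```

One practical consequence: POSIXct is usually the more convenient class to store in a data frame column, while POSIXlt is handy when you want to pull out individual components.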

strptime is a function to directly convert character vectors (of a variety of formats) to POSIXlt format.

as.POSIXlt converts a variety of data types to POSIXlt. It tries to be intelligent and do the sensible thing - in the case of character, it acts as a wrapper to strptime.

as.POSIXct converts a variety of data types to POSIXct. It also tries to be intelligent and do the sensible thing - in the case of character, it runs strptime first, then does the conversion from POSIXlt to POSIXct.
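As a rough check of the relationship between the three functions, the sketch below parses the same strings each way and compares the results. The sample dates and variable names are made up for illustration; tz = "UTC" is passed everywhere so the comparison is not affected by the local time zone.

```r
x   <- c("22-05-2012", "19-10-2022")
fmt <- "%d-%m-%Y"

a  <- strptime(x, fmt, tz = "UTC")                 # character -> POSIXlt
b  <- as.POSIXlt(x, tz = "UTC", format = fmt)      # same result via dispatch
ct <- as.POSIXct(x, tz = "UTC", format = fmt)      # parsed, then converted to POSIXct

class(a)
class(b)
class(ct)

# Both POSIXlt results represent the same instants
all(a == b)

# as.POSIXct on the strings matches converting the strptime result by hand
all(as.numeric(ct) == as.numeric(as.POSIXct(a)))
```

If you already know your input is character and you want POSIXlt, calling strptime directly simply skips the dispatch step.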

It makes sense that strptime is faster, because strptime only handles character input whilst the others first have to determine which method to use from the input type. It is also a bit more predictable: as the source above shows, strptime always coerces its input with as.character and parses it against the format you supply, so unexpected data typically comes back as NA instead of being dispatched to a method that tries to do the intelligent thing, which might not be what you want.
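The difference in how non-character input is handled can be seen with a Date object. This is only a small sketch in base R; the variable d is illustrative.

```r
d <- as.Date("2012-05-22")

# as.POSIXlt dispatches on the class of its input, so a Date is converted
# directly without any format string
as.POSIXlt(d)$year + 1900                       # 2012

# strptime first turns its input into text with as.character(), giving
# "2012-05-22", and then parses that text against the supplied format
ok  <- strptime(d, "%Y-%m-%d", tz = "UTC")      # format matches the text
bad <- strptime(d, "%d-%m-%Y", tz = "UTC")      # format does not match

is.na(ok)                                       # FALSE
is.na(bad)                                      # TRUE
```

In this sketch the mismatched format yields NA, so it is worth checking for NA after parsing rather than relying on an error to flag bad input.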

Fhnuzoag answered Oct 19 '22 17:10