
R tick data: merging date and time into a single object

I'm currently working with tick data in R and would like to merge the date and time into a single object, because I need a precise time object to compute some statistics on my data. Here is what my data looks like:

               date       time      price flag    exchange
2   XXH10   2010-02-02   08:00:03   2787 1824        E
3   XXH10   2010-02-02   08:00:04   2786    3        E
4   XXH10   2010-02-02   08:00:04   2787    6        E
5   XXH10   2010-02-02   08:00:04   2787    1        E
6   XXH10   2010-02-02   08:00:04   2787    1        E

Basically, I would like to merge the columns "date" and "time" into a single one.

marino89 asked Jul 23 '12 08:07




2 Answers

Create a datetime object with as.POSIXct:

as.POSIXct(paste(x$date, x$time), format="%Y-%m-%d %H:%M:%S")
[1] "2010-02-02 08:00:03 GMT" "2010-02-02 08:00:04 GMT" "2010-02-02 08:00:04 GMT"
[4] "2010-02-02 08:00:04 GMT" "2010-02-02 08:00:04 GMT"
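Since the goal is to compute statistics on the timestamps, it may help to fix the time zone explicitly and keep the result as a column. The following is a minimal sketch using base R only; the three sample rows, the `datetime` column name, and the choice of `tz = "UTC"` are assumptions for illustration, not part of the original data.

```r
# Sample rows mirroring the question's "date" and "time" columns (illustrative)
x <- data.frame(
  date  = c("2010-02-02", "2010-02-02", "2010-02-02"),
  time  = c("08:00:03", "08:00:04", "08:00:04"),
  price = c(2787, 2786, 2787),
  stringsAsFactors = FALSE
)

# Assumption: timestamps are in UTC; replace tz with the exchange's zone if known
x$datetime <- as.POSIXct(paste(x$date, x$time),
                         format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Seconds elapsed between consecutive ticks
gaps <- diff(as.numeric(x$datetime))
print(gaps)
```

Passing `tz` explicitly avoids depending on the session's local time zone, which otherwise decides how the strings are interpreted.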
Andrie answered Oct 06 '22 11:10



Of course, an (arguably) more elegant solution is possible with an extra package. For working with dates, that package is lubridate:

library(lubridate)
with(x, ymd(date) + hms(time))

This should produce a POSIXlt vector.
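As an alternative sketch within the same package, `ymd_hms()` parses the pasted date and time string in one step and returns a POSIXct vector. The sample rows and `tz = "UTC"` below are assumptions for illustration, and the code is guarded so it only runs when lubridate is installed.

```r
# Illustrative rows with the same "date" and "time" columns as the question
x <- data.frame(
  date = c("2010-02-02", "2010-02-02"),
  time = c("08:00:03", "08:00:04"),
  stringsAsFactors = FALSE
)

if (requireNamespace("lubridate", quietly = TRUE)) {
  # Assumption: timestamps are in UTC
  dt <- lubridate::ymd_hms(paste(x$date, x$time), tz = "UTC")
  # Inspect the class to confirm what kind of date-time object came back
  print(class(dt))
}
```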

UPDATE:

There is another solution using the general-purpose date and time conversion package anytime (based on the C++ Boost date_time library):

library(anytime)
with(x, anytime(paste(date, time)))

Indeed, comparing anytime with both base R and lubridate (deservedly considered rather slow; see "Why are my functions on lubridate dates so slow?"), the C++-based anytime wins:

x = read.csv(text = 'date,time
2010-02-02,08:00:03
2010-02-02,08:00:04
2010-02-02,08:00:04
2010-02-03,08:00:04
2010-02-04,08:00:05
2010-02-04,08:00:05
2010-02-04,08:00:06
2010-02-04,08:00:07
2010-02-04,08:00:08
2010-02-04,08:00:14')

microbenchmark::microbenchmark(
  base = with(x, as.POSIXct(paste(date, time), format="%Y-%m-%d %H:%M:%S")),
  anytime = with(x, anytime::anytime(paste(date, time))),
  lubri = with(x, lubridate::ymd(date) + lubridate::hms(time)),
  times = 1000L
)

Unit: microseconds
    expr      min        lq       mean   median        uq        max neval
    base   71.163   91.2555  104.38747  104.785  112.1185    256.997  1000
 anytime   40.508   52.5385   63.46973   61.843   68.5730    221.076  1000
   lubri 1596.490 1850.4400 2235.34254 1909.588  2033.096 110751.622  1000
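The benchmark above uses ten rows, while tick data sets are often much larger. A rough way to check the timing at a larger size, using only base R, might look like the sketch below; the row count `n`, the synthetic timestamps, and `tz = "UTC"` are made up for illustration, and timings will vary by machine.

```r
set.seed(1)
n <- 1e5  # hypothetical number of ticks

# Synthetic, sorted timestamps within one trading day (illustrative only)
secs <- sort(sample(0:(8 * 3600), n, replace = TRUE))
stamps <- as.POSIXct("2010-02-02 08:00:00", tz = "UTC") + secs

# Split into separate character columns, as in the question's data
x <- data.frame(
  date = format(stamps, "%Y-%m-%d"),
  time = format(stamps, "%H:%M:%S"),
  stringsAsFactors = FALSE
)

# Time the base R conversion on the larger input
timing <- system.time(
  dt <- as.POSIXct(paste(x$date, x$time),
                   format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
)
print(timing)
```

The same pattern can wrap the anytime or lubridate calls to compare them on data closer to real tick volumes.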
topchef answered Oct 06 '22 13:10
