I'm currently working in tick data with R and I would like to merge date and time into a single object as I need to get a precise time object to compute some statistics on my data. Here is how my data looks like:
date time price flag exchange 2 XXH10 2010-02-02 08:00:03 2787 1824 E 3 XXH10 2010-02-02 08:00:04 2786 3 E 4 XXH10 2010-02-02 08:00:04 2787 6 E 5 XXH10 2010-02-02 08:00:04 2787 1 E 6 XXH10 2010-02-02 08:00:04 2787 1 E
Basically, I would like to merge the columns "date" and "time" into a single one.
and. time() method in R can be used to merge together the date and time to obtain date-time object in POSIX format. Parameter : date – Date can be specified either in the Date format or in the form of character string referred by “YYYY-MM-DD”.
R provides several options for dealing with date and date/time data. The builtin as. Date function handles dates (without times); the contributed library chron handles dates and times, but does not control for time zones; and the POSIXct and POSIXlt classes allow for dates and times with control for time zones.
In addition to the time data types R also has a date data type. The difference is that the date data type keeps track of numbers of days rather than seconds. You can cast a string into a date type using the as. Date function.
Create a datetime
object with as.POSIXct
:
as.POSIXct(paste(x$date, x$time), format="%Y-%m-%d %H:%M:%S") [1] "2010-02-02 08:00:03 GMT" "2010-02-02 08:00:04 GMT" "2010-02-02 08:00:04 GMT" [4] "2010-02-02 08:00:04 GMT" "2010-02-02 08:00:04 GMT"
Of course, more elegant solution (arguably) is possible with extra package. When working with dates it's lubridate package:
library(lubridate) with(x, ymd(date) + hms(time))
should produce POSIXlt vector.
UPDATE:
There is another solution using general purpose date and time conversion package anytime
(based on C++ library Boost date_time):
library(anytime) with(x, anytime(paste(date, time)))
Indeed, comparing anytime
with both base R and lubridate
(deservedly considered rather slow - see Why are my functions on lubridate dates so slow?) C++ (anytime
) wins:
x = read.csv(text = 'date,time 2010-02-02,08:00:03 2010-02-02,08:00:04 2010-02-02,08:00:04 2010-02-03,08:00:04 2010-02-04,08:00:05 2010-02-04,08:00:05 2010-02-04,08:00:06 2010-02-04,08:00:07 2010-02-04,08:00:08 2010-02-04,08:00:14') microbenchmark::microbenchmark( base = with(x, as.POSIXct(paste(date, time), format="%Y-%m-%d %H:%M:%S")), anytime = with(x, anytime::anytime(paste(date, time))), lubri = with(x, lubridate::ymd(date) + lubridate::hms(time)), times = 1000L )
Unit: microseconds expr min lq mean median uq max neval base 71.163 91.2555 104.38747 104.785 112.1185 256.997 1000 anytime 40.508 52.5385 63.46973 61.843 68.5730 221.076 1000 lubri 1596.490 1850.4400 2235.34254 1909.588 2033.096 110751.622 1000
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With