I have a data set with more than 3 million records, with start.time and end.time as two of the variables. The first 10 observations are as follows:
   start.date start.time   end.date end.time
1  2012-07-13   15:01:32 2012-07-13 15:02:42
2  2012-07-05   18:26:31 2012-07-05 18:27:19
3  2012-07-14   20:23:21 2012-07-14 20:24:11
4  2012-07-29   16:09:54 2012-07-29 16:10:48
5  2012-07-21   14:58:32 2012-07-21 15:00:17
6  2012-07-04   15:36:31 2012-07-04 15:37:11
7  2012-07-22   18:28:31 2012-07-22 18:28:50
8  2012-07-09   21:08:42 2012-07-09 21:09:02
9  2012-07-05   09:44:52 2012-07-05 09:45:05
10 2012-07-02   18:50:47 2012-07-02 18:51:38
I need to calculate the difference between start.time and end.time.
I used the following code:
mbehave11$diff.time <- difftime(mbehave11$end.time, mbehave11$start.time, units="secs")
But I am getting this error:
Error in as.POSIXlt.character(x, tz, ...) :
  character string is not in a standard unambiguous format
In addition: Warning messages:
1: In is.na.POSIXlt(strptime(xx, f <- "%Y-%m-%d %H:%M:%OS", tz = tz)) :
  Reached total allocation of 1535Mb: see help(memory.size)
The difftime R function calculates the time difference of two date or time objects.
When a time difference is expressed in days, multiply it by the number of seconds in a day (24*60*60, or 86400) to get the difference in seconds. The same applies to a data set like the one above, where you want the total number of seconds that have elapsed between the start and end date.
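As a small sketch of that conversion, the snippet below subtracts two dates to get a difference in days and multiplies by 86400, then compares it with asking difftime for seconds directly. The two dates and the variable names are made up for illustration.

```r
# Subtracting two Date objects yields a difftime measured in days
d <- as.Date("2012-07-14") - as.Date("2012-07-13")

# Manual conversion: days times seconds per day
secs_manual <- as.numeric(d) * 24 * 60 * 60

# Same result by requesting seconds from difftime
secs_direct <- as.numeric(
  difftime(as.Date("2012-07-14"), as.Date("2012-07-13"), units = "secs")
)

secs_manual  # 86400
secs_direct  # 86400
```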
Converting a date and time into an epoch value makes it easier to find the difference, add, and subtract from a time value. For example, you could convert the time to an epoch and subtract it from another epoch value to quickly determine the difference.
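A minimal sketch of the epoch approach, using the first row of the question's data. The UTC time zone is an assumption, since the question does not say which zone the timestamps are in; the variable names are illustrative.

```r
# Parse both timestamps; tz = "UTC" is an assumption, not from the question
start <- as.POSIXct("2012-07-13 15:01:32", tz = "UTC")
end   <- as.POSIXct("2012-07-13 15:02:42", tz = "UTC")

# as.numeric() on a POSIXct gives seconds since 1970-01-01 00:00:00 UTC
start_epoch <- as.numeric(start)
end_epoch   <- as.numeric(end)

# Plain subtraction of the two epoch values gives elapsed seconds
elapsed <- end_epoch - start_epoch
elapsed  # 70
```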
You must turn your strings into date objects before you can do date/time arithmetic. Try this:
a) Reading your data:
R> dat <- read.table(textConnection("start.date start.time end.date end.time
2012-07-13 15:01:32 2012-07-13 15:02:42
2012-07-05 18:26:31 2012-07-05 18:27:19
2012-07-14 20:23:21 2012-07-14 20:24:11"), header=TRUE)
b) Parsing one date and time pair (the start columns):
R> strptime(paste(dat[,1], dat[,2]), "%Y-%m-%d %H:%M:%S")
[1] "2012-07-13 15:01:32" "2012-07-05 18:26:31" "2012-07-14 20:23:21"
c) Working on the set, converting to numeric:
R> as.numeric(difftime(strptime(paste(dat[,1],dat[,2]),"%Y-%m-%d %H:%M:%S"),
+                      strptime(paste(dat[,3],dat[,4]),"%Y-%m-%d %H:%M:%S")))
[1] -70 -48 -50
R>
Edit Some seven years later by someone else below.
d) To explain the results -70 -48 -50 above, take a look at the example row by row. The values are negative because the end time is subtracted from the start time:
[2012-07-13 15:01:32] - [2012-07-13 15:02:42] = -70 seconds
[2012-07-05 18:26:31] - [2012-07-05 18:27:19] = -48 seconds
[2012-07-14 20:23:21] - [2012-07-14 20:24:11] = -50 seconds
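To store positive durations as a new column, as the question asks, one option is to pass the end time first and request seconds explicitly. The sketch below swaps strptime for as.POSIXct, which keeps a single number per timestamp instead of a list of components and so may be lighter on memory for millions of rows. The UTC time zone and the names fmt, start, and end are assumptions for illustration.

```r
# Same three rows as above, built directly as a data frame
dat <- data.frame(
  start.date = c("2012-07-13", "2012-07-05", "2012-07-14"),
  start.time = c("15:01:32", "18:26:31", "20:23:21"),
  end.date   = c("2012-07-13", "2012-07-05", "2012-07-14"),
  end.time   = c("15:02:42", "18:27:19", "20:24:11"),
  stringsAsFactors = FALSE
)

fmt <- "%Y-%m-%d %H:%M:%S"

# Combine date and time before parsing; tz = "UTC" is an assumption
start <- as.POSIXct(paste(dat$start.date, dat$start.time), format = fmt, tz = "UTC")
end   <- as.POSIXct(paste(dat$end.date, dat$end.time), format = fmt, tz = "UTC")

# End first, start second, so the durations come out positive
dat$diff.time <- as.numeric(difftime(end, start, units = "secs"))
dat$diff.time  # 70 48 50
```

Passing units = "secs" keeps the result in seconds even when some differences exceed a minute; without it, difftime picks the unit automatically.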
I think you can use the lubridate package. It has a function called ymd_hms that parses date-times from strings, and it is much faster for large data sets:
library(lubridate)
dat <- read.table(textConnection("start.date start.time end.date end.time
2012-07-13 15:01:32 2012-07-13 15:02:42
2012-07-05 18:26:31 2012-07-05 18:27:19
2012-07-14 20:23:21 2012-07-14 20:24:11"), header=TRUE)
starttime = ymd_hms(paste(dat[,1], dat[,2]))
endtime = ymd_hms(paste(dat[,3], dat[,4]))
interval = difftime(endtime,starttime,units = "secs")
Or you can do it in one line, but it takes longer on a big data set:
difftime(paste(dat[,3], dat[,4]),paste(dat[,1], dat[,2]),units = "secs")