I'm learning R by analysing the results of a bike race and I'm having problems with the time data (how long each person took to finish the race).
The time data has the format "HH:MM:SS".
I tried converting it to POSIXct, but that adds a date component. I also tried the chron package, but it won't let me divide a number by a time object.
One of the things I want to do is to calculate average speeds using this time, so I need to be able to divide distance by time.
The package chron has classes to deal with times, and the function to use is, wait for it, times(). Here is an example using typical times for running a standard marathon:
library(chron)
tms <- c("2:06:00", "3:34:30", "4:12:59")
x <- times(tms)
You now have a times object, representing fractions of a day.
str(x)
Class 'times' atomic [1:3] 0.0875 0.149 0.1757
..- attr(*, "format")= chr "h:m:s"
You can perform speed calculations, but you will need to convert the class from times to numeric with as.numeric. Because the values are fractions of a day, dividing by 24 turns km per day into km per hour.
dist <- 42.2
as.numeric(dist/x/24)
[1] 20.09524 11.80420 10.00856
And there you have it: speeds in km/h.
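The same calculation can be spelled out with the unit conversion made explicit. This is only a sketch: the names finish, hours and speed_kmh are made up for illustration, and it assumes the chron package is installed and that the race distance is 42.2 km as above.

```r
library(chron)

# Finishing times as chron 'times' objects (stored as fractions of a day)
finish <- times(c("2:06:00", "3:34:30", "4:12:59"))

# Drop the class and convert fractions of a day to hours
hours <- as.numeric(finish) * 24

# Average speed in km/h for a 42.2 km course
speed_kmh <- 42.2 / hours
print(speed_kmh)
```

Converting to plain hours first keeps the division between two ordinary numeric vectors, which avoids the arithmetic restrictions on the times class.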
I would use POSIXct, for which you have by far the strongest support in base R and in add-on packages.
Whenever I use intra-daily data for which the day does not matter, I just add a base date of, say, January 1st of the current year. For all comparisons, differences, etc this washes out.
Also of note: as.numeric() of a POSIXct variable gets you back to normal numbers (seconds.subseconds since the epoch), which is handy for arithmetic, for storing the values in a database without a datetime type, or for transferring them to another system or language. Everybody has floating point, and (fractional) seconds since the epoch are easy to work with. POSIXct gives you added benefits for formatting, sequences, differences, plotting, ...
Here is a little example:
R> txt <- c("08:09:10", "09:10:11", "10:11:12", "11:12:13")
R> times <- as.POSIXct(paste("2013-01-01", txt))
R> times
[1] "2013-01-01 08:09:10 CST" "2013-01-01 09:10:11 CST"
+ "2013-01-01 10:11:12 CST" "2013-01-01 11:12:13 CST"
R> times - times[1]
Time differences in secs
[1] 0 3661 7322 10983
attr(,"tzone")
[1] ""
R> as.numeric(times - times[1])
[1] 0 3661 7322 10983
R>
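For the race data in the question, the same idea can be used by measuring each finishing time against midnight of the base date. This is a rough sketch rather than a definitive recipe: the names base_day, midnight, stamps and elapsed_hours are invented here, the time zone is fixed to UTC to sidestep daylight-saving surprises, and the approach assumes every finishing time is under 24 hours.

```r
# Finishing times as "HH:MM:SS" strings
finish <- c("02:06:00", "03:34:30", "04:12:59")

# Arbitrary base date; it cancels out in the subtraction below
base_day <- "2013-01-01"
midnight <- as.POSIXct(base_day, tz = "UTC")
stamps <- as.POSIXct(paste(base_day, finish), tz = "UTC")

# Elapsed time since midnight, forced to hours
elapsed_hours <- as.numeric(difftime(stamps, midnight, units = "hours"))

# Average speed in km/h for a 42.2 km course
speed_kmh <- 42.2 / elapsed_hours
print(speed_kmh)
```

Asking difftime() for units = "hours" explicitly means the numbers do not depend on which unit R would otherwise pick automatically.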
What you are looking at is not really a time, but an elapsed time. There are data types for elapsed time. In base R, the difftime class does this.
tms <- c("2:06:00", "3:34:30", "4:12:59", "08:09:10",
"09:10:11", "10:11:12", "11:12:13")
ta <- as.difftime(tms)
which displays as
> ta
Time differences in hours
[1] 2.100000 3.575000 4.216389 8.152778 9.169722 10.186667 11.203611
attr(,"tzone")
[1] ""
> format(ta)
[1] " 2.100000 hours" " 3.575000 hours" " 4.216389 hours" " 8.152778 hours" " 9.169722 hours"
[6] "10.186667 hours" "11.203611 hours"
You can do math with this as well by converting to numeric.
> 42.2/as.numeric(ta)
[1] 20.095238 11.804196 10.008564 5.176150 4.602102 4.142670 3.766643
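One thing to watch: as.difftime() defaults to units = "auto", so the unit of the result depends on the data (for example, a shortest time under one hour would be reported in minutes). A small sketch that pins the unit down, assuming the same "H:M:S" strings and a 42.2 km distance; the variable names are illustrative only:

```r
tms <- c("2:06:00", "3:34:30", "4:12:59")

# Parse as elapsed time and request hours explicitly
ta <- as.difftime(tms, format = "%H:%M:%S", units = "hours")

# Ask for hours again on conversion, so the result is unambiguous
hours <- as.numeric(ta, units = "hours")

# Average speed in km/h
speed_kmh <- 42.2 / hours
print(speed_kmh)
```

Passing units in both places is slightly redundant, but it makes the intent obvious to whoever reads the code later.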
The lubridate package also has types that deal with elapsed time, specifically duration.
library("lubridate")
ti <- as.duration(as.difftime(tms))
which displays as
> ti
[1] 7560s (~2.1 hours) 12870s (~3.58 hours) 15179s (~4.22 hours) 29350s (~8.15 hours)
[5] 33011s (~9.17 hours) 36672s (~10.19 hours) 40333s (~11.2 hours)
and you can do math with it after converting to numeric (here in seconds rather than hours, so the result is in km per second):
> 42.2/as.numeric(ti)
[1] 0.005582011 0.003278943 0.002780157 0.001437819 0.001278362 0.001150742 0.001046290
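To get km/h instead of km per second, the seconds can be scaled to hours before dividing. The sketch below assumes lubridate is installed and uses its hms() parser and period_to_seconds() helper instead of going through as.difftime(); the names secs and speed_kmh are made up for the example.

```r
library(lubridate)

tms <- c("2:06:00", "3:34:30", "4:12:59")

# Parse "H:M:S" strings into periods, then into plain seconds
secs <- period_to_seconds(hms(tms))

# Seconds -> hours, then average speed in km/h for 42.2 km
speed_kmh <- 42.2 / (secs / 3600)
print(speed_kmh)
```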