Hi, I am looking to subset some minutely data by time of day. I normally use xts,
doing something like:
subset.string <- 'T10:00/T13:00'
xts.min.obj[subset.string]
to get all the rows which are between 10am and 1pm (inclusive) EACH DAY, with the output in xts format. But it is a bit slow for my purposes, e.g.
j <- xts(rnorm(10e6),Sys.time()-(10e6:1))
system.time(j['T10:00/T16:00'])
user system elapsed
5.704 0.577 17.115
I know that data.table is very fast at subsetting large datasets, so I am wondering whether, in conjunction with the fasttime package to handle fast POSIXct creation, it would be worth creating a function like
dt.time.subset <- function(xts.min.obj, subset.string){
require(data.table)
require(fasttime)
x.dt <- data.table(ts=format(index(xts.min.obj),"%Y-%m-%d %H:%M:%S %Z"),
coredata(xts.min.obj))
out <- x.dt[,some.subsetting.operation.using."%between%"]
xts(out, fastPOSIXct(out[,ts]))
}
The idea is to convert the xts.min.obj into a data.table, add some sort of character index, use data.table to subset the relevant rows, and then use the resulting index with fasttime to recreate an xts output. Or is this too many excess operations for something that is already highly optimised and written in C?
If you're ok with specifying your range in UTC, you can do:
library(data.table)  # provides %between%
j[(.index(j) %% 86400) %between% c(10*3600, 16*3600 + 60)]
# +60 because xts includes that minute; you'll need to offset the times
# appropriately to match with xts unless you live in UTC :)
j <- xts(rnorm(10e6),Sys.time()-(10e6:1))
system.time(j[(.index(j) %% 86400) %between% c(10*3600, 16*3600 + 60)])
# user system elapsed
# 1.17 0.08 1.25
# likely faster on your machine as mine takes minutes to run the OP bench
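If you are not in UTC, the "offset the times appropriately" step could look something like the sketch below. It uses base R only, and in_time_window and its utc_offset_sec argument are names I made up for illustration; they are not part of xts or data.table. It assumes a fixed offset from UTC (so no daylight-saving handling) and uses a half-open window, so that 'T10:00/T16:00' maps to seconds 36000 up to but not including 57660.

```r
# Hypothetical helper: TRUE where the time of day lies in [start_sec, end_sec),
# both given as seconds after local midnight.
# utc_offset_sec is an assumed fixed local-minus-UTC offset in seconds (no DST).
in_time_window <- function(epoch_sec, start_sec, end_sec, utc_offset_sec = 0) {
  # Shift epoch seconds to local time, then reduce to seconds since midnight.
  sec_of_day <- (epoch_sec + utc_offset_sec) %% 86400
  sec_of_day >= start_sec & sec_of_day < end_sec
}

# Three UTC timestamps: just before the window, its first second, its last second.
idx <- as.numeric(as.POSIXct(c("2020-01-01 09:59:59",
                               "2020-01-01 10:00:00",
                               "2020-01-01 16:00:59"), tz = "UTC"))
keep <- in_time_window(idx, 10 * 3600, 16 * 3600 + 60)

# With an xts object j, the same mask would be applied as
#   j[in_time_window(.index(j), 10 * 3600, 16 * 3600 + 60)]
```

The half-open comparison is a deliberate choice here: %between% is inclusive at both ends, so with the +60 it would also keep a timestamp at exactly 16:01:00.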