Applying the split function to an xts object by weeks groups rows into weekly chunks. By default, each group runs from Monday to Sunday. What do I do if I want each group to run from Sunday to Saturday?
library(xts)
idx <- as.Date("2018-3-1") + 0:14
v <- 1:15
x <- xts(v, idx)
group <- split(x, f = 'weeks')
group
Output:
[[1]]
[,1]
2018-03-01 1 # Thursday
2018-03-02 2 # Friday
2018-03-03 3 # Saturday
2018-03-04 4 # Sunday
[[2]]
[,1]
2018-03-05 5 # Monday
2018-03-06 6 # Tuesday
2018-03-07 7 # Wednesday
2018-03-08 8 # Thursday
2018-03-09 9 # Friday
2018-03-10 10 # Saturday
2018-03-11 11 # Sunday
[[3]]
[,1]
2018-03-12 12 # Monday
2018-03-13 13 # Tuesday
2018-03-14 14 # Wednesday
2018-03-15 15 # Thursday
Desired Output:
[[1]]
[,1]
2018-03-01 1 # Thursday
2018-03-02 2 # Friday
2018-03-03 3 # Saturday
[[2]]
[,1]
2018-03-04 4 # Sunday
2018-03-05 5 # Monday
2018-03-06 6 # Tuesday
2018-03-07 7 # Wednesday
2018-03-08 8 # Thursday
2018-03-09 9 # Friday
2018-03-10 10 # Saturday
[[3]]
[,1]
2018-03-11 11 # Sunday
2018-03-12 12 # Monday
2018-03-13 13 # Tuesday
2018-03-14 14 # Wednesday
2018-03-15 15 # Thursday
I frequently split by weeks starting on Sunday rather than Monday, because I work with FX data (the markets open on Sunday afternoon, New York EST). Here is an efficient solution, split_FXweeks, using the "xts way" of splitting time series data. This approach is quite fast when you're working with high-density tick data over long periods of time.
Credit for this approach goes to trick 1 at the following link: http://darrendev.blogspot.com.au/2012/08/small-rxts-code-snippets-and-tips.html
I have also added a benchmark against the other suggested approach (split1 below) as a baseline.
idx <- as.Date("2018-3-1") + 0:14
v <- 1:15
x <- xts(v, idx)
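split_FXweeks <- function(x) {
  # Shift the index by 4 days so that the 7-day (604800 s) buckets, which are
  # counted from the Unix epoch (a Thursday), roll over on Sunday.
  ep <- .Call("endpoints", .index(x) + 4L * 86400L, 604800L,
              1, TRUE, PACKAGE = "xts")
  sp <- (ep + 1)[-length(ep)]  # first row of each week
  ep <- ep[-1]                 # last row of each week
  lapply(seq_along(ep), function(X) x[sp[X]:ep[X]])
}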
split1 <- function(idx, x) {
  week_num <- format(idx, "%U")
  group <- unname(split(x, f = week_num))
  group
}
library(microbenchmark)
microbenchmark(
y <- split_FXweeks(x),
z <- split1(idx, x))
# Unit: microseconds
# expr min lq mean median uq max neval
# y <- split_FXweeks(x) 52.521 60.167 72.90766 75.2390 80.6495 162.077 100
# z <- split1(idx, x) 325.681 351.658 383.13293 364.2215 384.9765 881.486 100
# > y
# [[1]]
# [,1]
# 2018-03-01 1
# 2018-03-02 2
# 2018-03-03 3
#
# [[2]]
# [,1]
# 2018-03-04 4
# 2018-03-05 5
# 2018-03-06 6
# 2018-03-07 7
# 2018-03-08 8
# 2018-03-09 9
# 2018-03-10 10
#
# [[3]]
# [,1]
# 2018-03-11 11
# 2018-03-12 12
# 2018-03-13 13
# 2018-03-14 14
# 2018-03-15 15
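The 4-day shift in split_FXweeks can be checked with plain day arithmetic. The following is only a sketch in base R, not part of xts: bucket_raw, bucket_mon, bucket_sun and starts_on are hypothetical names introduced here for illustration. It relies on the fact that 1970-01-01 was a Thursday, so unshifted 7-day buckets roll over on Thursdays, a 3-day shift moves the rollover to Monday, and a 4-day shift moves it to Sunday.

```r
# Same example dates as above: Thu 2018-03-01 to Thu 2018-03-15.
idx <- as.Date("2018-03-01") + 0:14
days <- as.numeric(idx)  # whole days since 1970-01-01 (a Thursday)

# 7-day bucket numbers under three different shifts (illustrative names).
bucket_raw <- days %/% 7        # no shift
bucket_mon <- (days + 3) %/% 7  # 3-day shift
bucket_sun <- (days + 4) %/% 7  # 4-day shift, as in split_FXweeks

# Weekday ("0" = Sunday ... "6" = Saturday) of the first date in each new bucket.
starts_on <- function(bucket) {
  format(idx[which(diff(bucket) != 0) + 1], "%w")
}

starts_on(bucket_raw)  # "4" "4"  (Thursdays)
starts_on(bucket_mon)  # "1" "1"  (Mondays)
starts_on(bucket_sun)  # "0" "0"  (Sundays)
```

The same reasoning suggests that a different week start could be obtained by changing the shift, although only the Sunday case is shown in this answer.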
Consider creating an external, equal-length vector of week numbers with the %U format, which treats Sunday as the first day of the week. See ?strftime.
%U
Week of the year as decimal number (00–53) using Sunday as the first day 1 of the week (and typically with the first Sunday of the year as day 1 of week 1). The US convention.
week_num <- format(idx, "%U")
group <- unname(split(x, f = week_num))
group
[[1]]
2018-03-01 1
2018-03-02 2
2018-03-03 3
[[2]]
2018-03-04 4
2018-03-05 5
2018-03-06 6
2018-03-07 7
2018-03-08 8
2018-03-09 9
2018-03-10 10
[[3]]
2018-03-11 11
2018-03-12 12
2018-03-13 13
2018-03-14 14
2018-03-15 15
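One caveat with %U is that the week number restarts every year, so a Sunday-to-Saturday week that straddles New Year gets two different labels, and the labels sort as text rather than by date. A possible workaround, sketched below in base R under the assumption that the index is a Date vector, is to number weeks from the Unix epoch instead. The names idx2, week_u and week_id are made up for this illustration.

```r
# Dates spanning a year boundary: Thu 2017-12-28 to Wed 2018-01-10.
idx2 <- as.Date("2017-12-28") + 0:13

# %U restarts at "00" in the new year.
week_u <- format(idx2, "%U")

# Continuous Sunday-based week number: days since 1970-01-01 (a Thursday),
# shifted by 4 days so that the number increases on Sundays.
week_id <- (as.numeric(idx2) + 4) %/% 7

# Group the dates both ways to compare.
by_u  <- split(idx2, week_u)
by_id <- split(idx2, week_id)

names(by_u)            # "00" "01" "52" "53" (not in date order)
unname(lengths(by_id)) # 3 7 4
```

Here Sunday 2017-12-31 and Monday 2018-01-01 fall in the same week_id group but get the %U labels "53" and "00". A vector like week_id should be usable as f in split(x, f = ...) in the same way as week_num above.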