I often run into the same issue: how to handle NA values when building quantitative trading models. The example below concerns a stock with EOD data since 1997-01-01, stored in an xts object with four columns named "High", "Low", "Close", "Volume". The data is from Bloomberg. When I try to calculate the rolling 20-day average volume, I get this error:
SMA(stock$Volume, 20)
Error in runSum(x, n) : Series contains non-leading NAs
I quickly located the problem (which I knew was NA values, since I have hit this a thousand times) and found the two days where volume data is missing. I have reproduced those days' data below. As a quick observation, the SMA, EMA etc. functions in TTR cannot handle NAs if they are preceded by numbers and followed by numbers.
library(xts)
stock <- as.xts(matrix(c(94.46, 92.377, 94.204, NA, 71.501, 70.457, 70.979, NA), 2, 4,
                       byrow = TRUE, dimnames = list(NULL, c("High", "Low", "Close", "Volume"))),
                as.Date(c("1998-07-07", "1999-02-22")))
What is the best way to handle this issue? Is it to store stock$Volume as a temporary object with the NA values removed, calculate the rolling volume, and then merge it back in with merge.xts while adding fill = NA so the NA values are inserted again? But is that correct, since you then take the last 20 trading days with data and not just the 19 available in the 20-day window?
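The two readings of the window can be compared side by side. The following is a minimal sketch, assuming a hypothetical toy volume series rather than the Bloomberg data; the names volume, sma_avail and sma_window are illustrative only, and the window is shortened from 20 to 5 to suit the toy data.

```r
library(xts)   # also attaches zoo, which provides rollapplyr()
library(TTR)

# Hypothetical toy data: 30 daily volumes with two interior NAs
set.seed(1)
dates <- as.Date("1998-07-01") + 0:29
v <- round(runif(30, 1e5, 2e5))
v[c(10, 20)] <- NA
volume <- xts(v, order.by = dates)
colnames(volume) <- "Volume"

n <- 5  # window length (20 in the question)

# Option 1: drop the NAs, average the last n *available* observations,
# then merge back so the missing dates reappear as NA
vol_clean <- na.omit(volume)
sma_clean <- SMA(vol_clean, n)
sma_avail <- merge(volume, sma_clean)[, 2]

# Option 2: keep the n-row window and average whatever is non-NA inside it
sma_window <- rollapplyr(volume, n, function(x) mean(x, na.rm = TRUE), fill = NA)
```

Option 1 always averages n observations but reaches further back in time when a day is missing; Option 2 keeps the window fixed and averages fewer points. Which one is appropriate depends on what the rolling volume is meant to represent.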
It is my hope that some sort of "best practice" can be the outcome of this post as I assume this issue also happens for other R-users in finance whether they get their data from Bloomberg, Yahoo Finance or another source.
I don't know about "best practice" but one alternative might be what are called "inhomogeneous time series operators", as presented in Operators on Inhomogeneous Time Series.
This type of question is a good fit for the Quantitative Finance stack exchange site (e.g. see How to update an exponential moving average with missing values?).
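To give a flavour of the idea, here is a minimal sketch of an exponential moving average whose decay depends on the elapsed time between observations, so that a missing day simply becomes a longer gap. This is a simplified illustration in the spirit of that approach, not the operators as defined in the paper; the function name ema_irregular and the parameter tau (the decay time scale, in days) are assumptions made for this example.

```r
# Hypothetical helper: time-weighted EMA for irregularly spaced observations.
# times:  Date (or numeric) vector of observation times, increasing
# values: numeric vector of observations, with missing days already dropped
# tau:    decay time scale in the same units as the time differences (days here)
ema_irregular <- function(times, values, tau) {
  stopifnot(length(times) == length(values), length(values) > 0, tau > 0)
  out <- numeric(length(values))
  out[1] <- values[1]                        # seed with the first observation
  for (i in seq_along(values)[-1]) {
    dt <- as.numeric(times[i]) - as.numeric(times[i - 1])  # elapsed time
    mu <- exp(-dt / tau)                     # longer gap => old value decays more
    out[i] <- mu * out[i - 1] + (1 - mu) * values[i]
  }
  out
}

# Usage: 1998-07-07 is missing, so the first gap is two days instead of one
d <- as.Date(c("1998-07-06", "1998-07-08", "1998-07-09"))
v <- c(100, 200, 300)
ema_irregular(d, v, tau = 20)
```

With this kind of update there is no need to fill or re-insert the missing rows: the series is filtered to the dates that have data, and the weights adjust to the gaps.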