
R: How do I fill gaps (holidays) in a daily stock index time series with the previous day's data?

I'm using R and working with daily stock index time series from different countries. To compare the indexes (correlation, causality, etc.), I need all the series to have the same number of rows, but because holidays differ between countries, the number of rows differs from series to series.

I'm working with .csv files extracted from Yahoo Finance, like this:

> head(sp)
>           Date    Open    High     Low   Close     Volume Adj.Close
>1288 2010-01-04 1116.56 1133.87 1116.56 1132.99 3991400000   1132.99
>1287 2010-01-05 1132.66 1136.63 1129.66 1136.52 2491020000   1136.52
>1286 2010-01-06 1135.71 1139.19 1133.95 1137.14 4972660000   1137.14

For example, suppose 2010-01-07 is a holiday. In that case, the next line in the file (line 1285) is 2010-01-08:

> head(sp)
>           Date    Open    High     Low   Close     Volume Adj.Close
>1288 2010-01-04 1116.56 1133.87 1116.56 1132.99 3991400000   1132.99
>1287 2010-01-05 1132.66 1136.63 1129.66 1136.52 2491020000   1136.52
>1286 2010-01-06 1135.71 1139.19 1133.95 1137.14 4972660000   1137.14
>1285 2010-01-08 1140.52 1145.39 1136.22 1144.98 4389590000   1144.98

I need to fill the gap at 2010-01-07 with the previous day's data, like this:

> head(sp)
>           Date    Open    High     Low   Close     Volume Adj.Close
>1288 2010-01-04 1116.56 1133.87 1116.56 1132.99 3991400000   1132.99
>1287 2010-01-05 1132.66 1136.63 1129.66 1136.52 2491020000   1136.52
>1286 2010-01-06 1135.71 1139.19 1133.95 1137.14 4972660000   1137.14
>1285 2010-01-07 1135.71 1139.19 1133.95 1137.14 4972660000   1137.14
>1284 2010-01-08 1140.52 1145.39 1136.22 1144.98 4389590000   1144.98

How can I do this?

My code so far (note all the libraries I tried while attempting to solve this):

>library(PerformanceAnalytics)
>library(tseries)
>library(urca)
>library(zoo)
>library(lmtest)
>library(timeDate)
>library(timeSeries)

>setwd("C:/Users/Fatima/Documents/R")

>sp = read.csv("SP500.csv", header = TRUE, stringsAsFactors = FALSE)
>sp$Date = as.Date(sp$Date)
>sp = sp[order(sp$Date), ]

Sorry about my bad English.

FlávioCorinthians asked Mar 19 '15 13:03


1 Answer

Package xts is useful here:

DF <- read.table(text = "           Date    Open    High     Low   Close     Volume Adj.Close
1288 2010-01-04 1116.56 1133.87 1116.56 1132.99 3991400000   1132.99
1287 2010-01-05 1132.66 1136.63 1129.66 1136.52 2491020000   1136.52
1286 2010-01-06 1135.71 1139.19 1133.95 1137.14 4972660000   1137.14
1285 2010-01-08 1140.52 1145.39 1136.22 1144.98 4389590000   1144.98", header = TRUE)

DF$Date <- as.Date(DF$Date)

library(xts)
X <- as.xts(DF[,-1], order.by = DF$Date)
na.locf(merge(X, seq(min(DF$Date), max(DF$Date), by = 1)))
#              Open    High     Low   Close     Volume Adj.Close
#2010-01-04 1116.56 1133.87 1116.56 1132.99 3991400000   1132.99
#2010-01-05 1132.66 1136.63 1129.66 1136.52 2491020000   1136.52
#2010-01-06 1135.71 1139.19 1133.95 1137.14 4972660000   1137.14
#2010-01-07 1135.71 1139.19 1133.95 1137.14 4972660000   1137.14
#2010-01-08 1140.52 1145.39 1136.22 1144.98 4389590000   1144.98
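
To see why this works, the one-liner can be split into its steps. This is only a sketch under a few assumptions: the names `calendar`, `gaps`, `filled`, and `out` are made up for illustration, only two of the price columns are kept to save space, and the full daily index is wrapped in a zero-column xts object via `xts(, order.by = ...)` instead of passing the bare date vector to `merge`.

```r
library(xts)

# Same dates as above, reduced to two columns for brevity
DF <- data.frame(
  Date  = as.Date(c("2010-01-04", "2010-01-05", "2010-01-06", "2010-01-08")),
  Open  = c(1116.56, 1132.66, 1135.71, 1140.52),
  Close = c(1132.99, 1136.52, 1137.14, 1144.98)
)
X <- as.xts(DF[, -1], order.by = DF$Date)

# 1. A zero-column xts object that carries only the complete daily index
calendar <- xts(, order.by = seq(min(DF$Date), max(DF$Date), by = "day"))

# 2. Outer join on the index: dates missing from X show up as rows of NA
gaps <- merge(X, calendar)

# 3. "Last observation carried forward": each NA takes the previous row's value
filled <- na.locf(gaps)

# Optional: convert back to a data.frame with an explicit Date column
out <- data.frame(Date = index(filled), coredata(filled), row.names = NULL)
```

Note that `na.locf` cannot fill a gap on the very first date, because there is no earlier observation to carry forward; such leading rows stay `NA`.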

Edit:

In response to your comment: You can exclude weekends like this:

dates <- seq(min(DF$Date), max(DF$Date), by = 1)
# you might have to adjust the day names below to match your locale
dates <- dates[!(weekdays(dates) %in% c("Saturday", "Sunday"))]
na.locf(merge(X, dates))
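
If you would rather not depend on translated day names, one possible alternative (a sketch, not part of the code above) is to test the numeric day of the week that `POSIXlt` exposes. The names `is_weekend` and `business_days` and the date range are made up for illustration.

```r
# Illustrative range that spans one weekend (2010-01-09 and 2010-01-10)
dates <- seq(as.Date("2010-01-04"), as.Date("2010-01-12"), by = "day")

# POSIXlt stores the day of the week as a number: 0 = Sunday, ..., 6 = Saturday
is_weekend <- as.POSIXlt(dates)$wday %in% c(0, 6)

# Keep Monday to Friday only, independent of the session locale
business_days <- dates[!is_weekend]
```

`business_days` can then take the place of `dates` in the `merge` call.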

Roland answered Sep 25 '22 04:09