I'm running into issues plotting stock data in ggplot2 when the x-axis contains gaps from weekends and holidays. This post has been very helpful, but I run into a variety of issues when trying to use ordered factors.
library(xts)
library(grid)
library(dplyr)
library(scales)
library(bdscale)
library(ggplot2)
library(quantmod)
getSymbols("SPY", from = Sys.Date() - 1460, to = Sys.Date(), adjust = TRUE, auto.assign = TRUE)
input <- data.frame(SPY["2015/"])
names(input) <- c("Open", "High", "Low", "Close", "Volume", "Adjusted")
# i've tried changing rownames() to index(), and the plot looks good, but the x-axis is inaccurate
# i've also tried as.factor()
xaxis <- as.Date(rownames(input))
input$xaxis <- xaxis
p <- ggplot(input)
p <- p + geom_segment(aes(x = xaxis, xend = xaxis, y = Low, yend = High), size = 0.50) # body
p <- p + geom_segment(aes(x = xaxis - 0.4, xend = xaxis, y = Open, yend = Open), size = 0.90) # open
p <- p + geom_segment(aes(x = xaxis, xend = xaxis + 0.4, y = Close, yend = Close), size = 0.90) # close
p <- p + scale_y_log10() # log-scaled y-axis
p + ggtitle("SPY: 2015")
The plot above (sans red boxes) is generated with the code above. The following charts show some of the issues I hit when attempting solutions. First, if I use the data frame's index, I get a nice-looking graph, but the x-axis is inaccurate: the data currently ends in October, but in the plot below it ends in July:
xaxis <- as.Date(index(input))
Second, if I try coercing the rownames to an ordered factor, I lose my horizontal tick data (representing the open and the close).
xaxis <- factor(rownames(input), ordered = TRUE)
The same issue of removing the horizontal ticks happens if I use the package bdscale, but the gridlines are cleaner:
p <- p + scale_x_bd(business.dates = xaxis)
The method below uses faceting to remove spaces between missing dates, then removes white space between facets to recover the look of an unfaceted plot.
First, we create a grouping variable that increments each time there's a break in the dates (code adapted from this SO answer). We'll use this later for faceting.
input$group = c(0, cumsum(diff(input$xaxis) > 1))
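To see what that line does, here is a minimal sketch on a made-up vector of four dates that straddles one weekend. The `dates`, `gaps`, and `group` names are only for illustration and are not part of the SPY data.

```r
# Hypothetical trading dates: Thu, Fri, (weekend gap), Mon, Tue
dates <- as.Date(c("2015-01-08", "2015-01-09", "2015-01-12", "2015-01-13"))

# diff() gives the number of days between consecutive dates
gaps <- as.numeric(diff(dates))

# Any gap longer than one day starts a new group;
# the leading 0 assigns the first date to group 0
group <- c(0, cumsum(gaps > 1))

print(data.frame(dates, group))
```

Each run of consecutive calendar days shares one group number, so each facet later holds one uninterrupted stretch of trading days.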
Now we add the following code to your plot. facet_grid creates a new facet at each location where there was a break in the date sequence due to a weekend or holiday. scale_x_date adds major tick marks once per week and minor grid lines for each day, but you can adjust this. The theme function gets rid of the facet strip labels and the vertical spaces between facets:
p + facet_grid(. ~ group, space="free_x", scales="free_x") +
  scale_x_date(breaks=seq(as.Date("2015-01-01"), max(input$xaxis), "1 week"),
               minor_breaks="1 day",
               labels=date_format("%b %d, %Y")) +
  # panel.spacing replaces panel.margin, which was deprecated in ggplot2 2.2.0
  theme(axis.text.x=element_text(angle=-90, hjust=0.5, vjust=0.5, size=11),
        panel.spacing = unit(-0.05, "lines"),
        strip.text=element_text(size=0),
        strip.background=element_rect(fill=NA)) +
  ggtitle("SPY: 2015")
Here's the resulting plot. The spaces for weekends and holidays are gone. The major breaks mark each week. I set the weeks in the scale_x_date breaks argument to start on a Thursday since none of the holidays fell on a Thursday and therefore each facet has a major tick mark for the date. (In contrast, the default breaks would fall on a Monday. Since holidays often fall on a Monday, weeks with Monday holidays would not have a major tick mark with the default breaks.) Note, however, that the spacing between the major breaks inherently varies based on how many days the market was open that week.
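If you want to check a candidate anchor day before plotting, a rough sketch like the one below lists the weekly breaks that have no matching trading day. The `trading_days` vector and the `missing_breaks` helper are hypothetical and made up for illustration; with the real data you would pass `input$xaxis` instead.

```r
# Hypothetical trading days around a Monday holiday (2015-01-19)
trading_days <- as.Date(c("2015-01-15", "2015-01-16", "2015-01-20",
                          "2015-01-21", "2015-01-22", "2015-01-23"))

# Hypothetical helper: weekly breaks from `start` that fall on a
# date with no trading data (these would get no major tick)
missing_breaks <- function(start, days) {
  breaks <- seq(as.Date(start), max(days), by = "1 week")
  breaks[!breaks %in% days]
}

monday_missing   <- missing_breaks("2015-01-19", trading_days)  # Monday anchor
thursday_missing <- missing_breaks("2015-01-15", trading_days)  # Thursday anchor

print(monday_missing)
print(thursday_missing)
```

An empty result means every weekly break lands on a day that exists in the data, which is the property the Thursday anchor is meant to provide.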