
Mixing line and scatterplot in ggplot

Tags: r, ggplot2

I've looked around a fair bit but I am stumped on this one. I cannot find a way to plot a line that is independent of a scatterplot drawn on the same axes. Here is some of my data and code to clarify the problem. I have data of the following form:

> head(allData)
  AnnounceDate MarketProbability DealStatus binary BrierScore
1   2000-04-10         0.3333333   Complete      1  0.2340565
2   2000-06-14         0.2142857   Complete      1  0.3618200
3   2000-06-26         0.6846154   Complete      1  0.3690167
4   2000-06-16         0.1875000   Complete      1  0.4364041
5   2000-10-05         0.9555556   Complete      1  0.3078432
6   2000-10-19         0.8500000   Complete      1  0.2670799

I would like to create a scatter plot of MarketProbability vs. AnnounceDate, and use colour to show whether DealStatus is Complete or Terminated.

(ggplot(data=allData, aes(x=AnnounceDate, y=MarketProbability, colour=DealStatus))
 + geom_point() + scale_colour_hue(h = c(180,0)))


I would also like to overlay a rolling Brier score as a line. I have tried:

(ggplot(data=allData, aes(x=AnnounceDate, y=MarketProbability, colour=DealStatus))
 + geom_point() + scale_colour_hue(h=c(180,0))
 + geom_line(aes(x=AnnounceDate, y=BrierScore)))


I am confused about why two coloured lines are plotted. How do I dissociate the line from the previously plotted data?

mgilbert asked Dec 01 '12 03:12



1 Answer

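As @MattBagg has pointed out, this issue is fixed by moving the colour mapping out of the main ggplot() call and into the geom_point() call. Aesthetics set in ggplot() are inherited by every layer, so geom_line() was also picking up colour=DealStatus and drawing one line per DealStatus level.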

library(ggplot2)

allData = read.table(header=TRUE,
colClasses=c("Date", "numeric", "character", "numeric", "numeric"),
text="AnnounceDate MarketProbability DealStatus binary BrierScore
2000-04-10         0.3333333   Complete      1  0.2340565
2000-06-14         0.2142857   Complete      1  0.3618200
2000-06-26         0.6846154   Complete      1  0.3690167
2000-06-16         0.1875000   Complete      1  0.4364041
2000-10-05         0.9555556   Complete      1  0.3078432
2000-10-19         0.8500000   Complete      1  0.2670799")

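# x is shared by both layers; colour is mapped only in geom_point(),
# so geom_line() draws a single line that is not split by DealStatus
p1 = ggplot(data=allData, aes(x=AnnounceDate)) +
     geom_point(aes(y=MarketProbability, colour=DealStatus)) +
     scale_colour_hue(h = c(180,0)) +
     geom_line(aes(y=BrierScore))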

ggsave(filename="plot_1.png", plot=p1, height=2.5, width=5)
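
If you would rather keep colour=DealStatus in the main ggplot() call, another option is to stop the line layer from inheriting the plot-level aesthetics with inherit.aes = FALSE. The following is only a sketch: the toy data frame and its values are made up for illustration (they are not the asker's data), and the names toy and p2 are hypothetical.

```r
library(ggplot2)

# Hypothetical toy data containing both deal statuses
# (values are invented for illustration only)
toy <- data.frame(
  AnnounceDate      = as.Date(c("2000-04-10", "2000-06-14",
                                "2000-06-26", "2000-10-05")),
  MarketProbability = c(0.33, 0.21, 0.68, 0.96),
  DealStatus        = c("Complete", "Terminated", "Complete", "Terminated"),
  BrierScore        = c(0.23, 0.36, 0.37, 0.31)
)

# colour stays in ggplot(), so geom_point() inherits it as before.
# The line layer opts out of inheritance and therefore has to
# declare its own x and y mappings.
p2 <- ggplot(toy, aes(x = AnnounceDate, y = MarketProbability,
                      colour = DealStatus)) +
  geom_point() +
  scale_colour_hue(h = c(180, 0)) +
  geom_line(aes(x = AnnounceDate, y = BrierScore), inherit.aes = FALSE)
```

Because the line layer no longer sees the DealStatus mapping, it is drawn as one line in the default colour. This is handy when most layers should share the colour mapping and only one should ignore it.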


bdemarest answered Sep 27 '22 16:09