
R - ggplot2 - geom_line - Get rid of straight line for missing values

Tags: r, ggplot2

I have data that I am trying to plot. I have several variables that range from the years 1880-2012. I have one observation per year. But sometimes a variable does not have an observation for a number of years. For example, it may have an observation from 1880-1888, but then not from 1889-1955 and then from 1956-2012. I would like ggplot2 + geom_line to not have anything in the missing years (1889-1955). But it connects 1888 and 1956 with a straight line. Is there anything I can do to remove this line? I am using the ggplot function.

Unrelated question, but is there a way to get ggplot to not sort my variable names in the legend alphabetically? I have code like this:

ggplot(dataFrame, aes(Year, value, colour=Name)) + geom_line()

Alternatively, is there a way to add numbers in front of the variable names (Name1, ..., Name10) in the legend? For example: 1. Name1, 2. Name2, ..., 10. Name10

asked Sep 29 '13 04:09 by bill999



1 Answer

Here's some sample data to answer your questions. I've added geom_point() to make it easier to see which values are actually in the data:

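library(ggplot2)
set.seed(1234)  # make the random sample data reproducible
# dat: five series, one value per year for 2000-2013
dat <- data.frame(Year=rep(2000:2013,5),
            value=rep(1:5,each=14)+rnorm(5*14,0,.5),
            Name=rep(c("Name1","End","First","Name2","Name 3"),each=14))
# dat2: same rows, but 12 randomly chosen values are set to NA
dat2 <- dat
dat2$value[sample.int(5*14,12)] <- NA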

dat3 is probably what your data looks like: the rows for the missing years are simply absent. (The only difference is that I'm treating Year as an integer.)

dat3 <- dat2[!is.na(dat2$value),]

# POINTS ARE CONNECTED WITH NO DATA IN BETWEEN #
ggplot(dat3, aes(Year, value, colour=Name)) + 
     geom_line() + geom_point()

However, if you add rows to your data for the years that are missing and set value to NA in those rows, geom_line() breaks the line at each NA and you get the gaps.

# POINTS ARE NOT CONNECTED #
ggplot(dat2, aes(Year, value, colour=Name)) + 
     geom_line() + geom_point()
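If your data looks like dat3 (no rows at all for the missing years), the NA rows have to be created first. One possible way to do that in base R is sketched below. The data frame obs and the names full_grid and padded are made up for illustration, and this is only one of several approaches (tidyr::complete() is another), so treat it as a sketch rather than the definitive method:

```r
# Hypothetical data with a gap: there are no rows at all for 2003-2009
obs <- data.frame(Year  = c(2000:2002, 2010:2013),
                  value = c(1.0, 1.2, 1.1, 2.0, 2.3, 2.1, 2.4),
                  Name  = "Name1",
                  stringsAsFactors = FALSE)

# Every Year/Name combination that should appear in the plot data
full_grid <- expand.grid(Year = min(obs$Year):max(obs$Year),
                         Name = unique(obs$Name),
                         stringsAsFactors = FALSE)

# Left join onto the grid: years without an observation get value = NA
padded <- merge(full_grid, obs, by = c("Year", "Name"), all.x = TRUE)

# Keep rows in Year order within each Name
padded <- padded[order(padded$Name, padded$Year), ]
```

Plotting padded with the same ggplot(padded, aes(Year, value, colour=Name)) + geom_line() call should then leave 2003-2009 empty instead of joining 2002 to 2010.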

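And finally, to answer your last question, this is how you change the order and labels of Name in the legend. The breaks argument sets the order, and labels gives the text shown for each break, in the same order: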

# CHANGE THE ORDER AND LABELS IN THE LEGEND #
ggplot(dat2, aes(Year, value, colour=Name)) + 
     geom_line() + geom_point() + 
     scale_colour_discrete(labels=c("Beginning","Name 1","Name 2","Name 3","End"),
                             breaks=c("First","Name1","Name2","Name 3","End"))
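Another option, not part of the code above, is to fix the order in the data itself: ggplot2 lists a discrete variable in the order of its factor levels, so converting Name to a factor with explicit levels controls the legend order, and numbered labels can be built with paste0(). The small data frame and the names wanted_order, numbered and NameNumbered are made up for this sketch:

```r
# Hypothetical data: three series, three years each
dat <- data.frame(Year  = rep(2000:2002, 3),
                  value = c(1.0, 1.1, 1.2, 2.0, 2.1, 2.2, 3.0, 3.1, 3.2),
                  Name  = rep(c("Name1", "End", "First"), each = 3),
                  stringsAsFactors = FALSE)

# The legend order we want (alphabetical would be End, First, Name1)
wanted_order <- c("First", "Name1", "End")
name_chr <- dat$Name

# Factor levels, not alphabetical order, determine the legend order
dat$Name <- factor(name_chr, levels = wanted_order)

# Numbered labels: "1. First", "2. Name1", "3. End"
numbered <- paste0(seq_along(wanted_order), ". ", wanted_order)
dat$NameNumbered <- factor(name_chr, levels = wanted_order, labels = numbered)
```

Using colour=NameNumbered in aes() should then give a legend that reads 1. First, 2. Name1, 3. End, without needing scale_colour_discrete().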
answered Nov 14 '22 22:11 by Mark Nielsen
