
How to plot a sparse (with gaps) line with its segments colored according to some factor in R?

Tags: r, ggplot2, lattice

I have a data.frame containing a time series. It contains NAs, and it also has a factor that I'd like to use to highlight different segments of the line.

flow.mndnr <- function(id, start, end) {
  uri <- sprintf("http://maps1.dnr.state.mn.us/cgi-bin/csg.pl?mode=dump_hydro_data_as_csv&site=%s&startdate=%s&enddate=%s", id, start, end)
  dat <- read.csv(url(uri), colClasses=c(Timestamp="Date"))
  rng <- range(dat$Timestamp)
  d <- data.frame(Timestamp=seq(rng[1], rng[2], by='day'))
  merge(d, dat, all.x=TRUE)
}
dat <- flow.mndnr("28062001", as.Date("2002-04-02"), as.Date("2011-10-05"))
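To show what the merge in flow.mndnr does without hitting the network, here is a minimal sketch on made-up data. The names obs, cal, padded and flow are illustrative only and are not part of the real data set.

```r
# Made-up observations with 2010-01-03 and 2010-01-04 missing.
obs <- data.frame(
  Timestamp = as.Date(c("2010-01-01", "2010-01-02", "2010-01-05")),
  flow      = c(1.5, 2.0, 3.2)
)

# Full daily calendar spanning the observed range.
rng <- range(obs$Timestamp)
cal <- data.frame(Timestamp = seq(rng[1], rng[2], by = "day"))

# Left join: every calendar day is kept; days without data get NA.
padded <- merge(cal, obs, all.x = TRUE)
print(padded)
```

The days that have no observation show up as rows with NA, which is what makes the gaps in the series explicit.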

I can plot it unconditionally:

library(lattice)
xyplot(Discharge..cfs. ~ Timestamp, dat, type='l', cex=0.5, auto.key=TRUE)


But I can't get rid of the connecting lines when I try to introduce the factor:

xyplot(Discharge..cfs. ~ Timestamp, dat, type='l',
    groups=dat$Discharge..cfs..Quality, cex=0.5, auto.key=TRUE)


The same happens with ggplot2:

library(ggplot2)
# Copy the quality column to a shorter name
dat$quality <- dat$Discharge..cfs..Quality
ggplot(dat, aes(x=Timestamp, y=Discharge..cfs.)) +
  geom_path(aes(colour=quality)) + theme(legend.position='bottom')


I tried geom_line with no success. I read in the ggplot2 mailing list archive that geom_path is the way to go, but it does not quite work for me.

P.S. Why does ggplot2 not like dots in a column name, so that I had to use another one?

asked Nov 03 '22 03:11 by mlt


1 Answer

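The problem is with the grouping. Mapping colour to quality also groups the data by quality, so points with the same quality are joined across the gaps. You can group by year instead to skip these jumps: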

dat$grp <- format(dat$Timestamp, "%Y")
ggplot(dat, aes(x=Timestamp, y=Discharge..cfs.)) +
    geom_path(aes(colour = quality, group = grp)) + 
    theme(legend.position='bottom')


Edit: To answer the comment in detail: as long as you don't know which variable to group by, you cannot group properly. If some months are missing within a year, this code will of course still produce jumps. In that case, I suggest grouping by year and month:

dat$grp <- paste(format(dat$Timestamp, "%Y"), format(dat$Timestamp, "%m"))
ggplot(dat, aes(x=Timestamp, y=Discharge..cfs.)) +
    geom_path(aes(colour = quality, group = grp)) + 
    theme(legend.position='bottom')

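As a sketch of a different grouping, not taken from the code above: when the gaps are already explicit NA rows, a group id can be derived from the NAs themselves instead of from the calendar. The data frame toy and its flow column are made up for illustration.

```r
# Made-up daily series standing in for `dat`; NA marks a day without data.
toy <- data.frame(
  Timestamp = seq(as.Date("2010-01-01"), by = "day", length.out = 10),
  flow      = c(1, 2, NA, 4, 5, 6, NA, NA, 9, 10)
)

# The counter increases at every missing value, so each unbroken stretch
# of observations shares one id (an NA row is counted with the stretch after it).
toy$grp <- cumsum(is.na(toy$flow))
print(toy)
```

Passing group = grp to geom_path in place of the year-based grouping should then break the line only where data are actually missing. I have not run this against the original data, so treat it as an assumption to check.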

answered Nov 09 '22 11:11 by Arun