
organize text on geom_point using geom_text

Tags: r, ggplot2

I have a line plot with time points that are hard to distinguish by colour alone, so I would like to label them on the plot. However, the labels overlap (see plot below), which makes them hard to read.

The plot currently looks like this:

[image: current plot]

Is there a way to 'stack' the labels, or a script that ensures they do not overlap? Something like this:

[image: desired label layout]

Any help would be appreciated.

Here is the code I used to produce the plot:

 require(ggplot2)
 require(plyr)
 require(reshape)

# create sample data
set.seed(666)
dfn <- data.frame(
Referral  = seq(as.Date("2007-01-15"), len= 26, by="23 day"),
VISIT01  = seq(as.Date("2008-06-15"), len= 24, by="15 day")[sample(30, 26)],
VISIT02  = seq(as.Date("2008-12-15"), len= 24, by="15 day")[sample(30, 26)],
VISIT03  = seq(as.Date("2009-01-01"), len= 24, by="15 day")[sample(30, 26)],
VISIT04  = seq(as.Date("2009-03-30"), len= 24, by="60 day")[sample(30, 26)],
VISIT05  = seq(as.Date("2010-11-30"), len= 24, by="6 day")[sample(30, 26)],
VISIT06  = seq(as.Date("2011-01-30"), len= 24, by="6 day")[sample(30, 26)],
Discharge = seq(as.Date("2012-03-30"), len= 24, by="30 day")[sample(30, 26)],
Patient  = factor(1:26, labels = LETTERS),
openCase  = rep(0:1, 100)[sample(100, 26)])

 # set today's date for cases that do not have a Discharge date
 dfn$Discharge[ is.na(dfn$Discharge) ] <- as.Date("2014-01-30")

 mdfn <- melt(dfn, id=c('Patient', 'openCase'), variable_name = "Visit")
 names(mdfn)[4] <- 'Year' # rename 

 # order data in mdfn by 'Referral' in dfn
 mdfn$Patient <- factor(mdfn$Patient,levels = 
 (dfn$Patient[order(dfn$Referral)]),ordered = TRUE)

 # subset a dataset to avoid 'Discharge' for cases that are not closed 
 mdfn2 <- subset(mdfn,!(Visit=="Discharge" & Year > as.Date("2014-01-01")))

 # the plot as it looks now
 ggplot(mdfn, aes(Year, Patient)) +
     geom_blank() +
     geom_line(data = mdfn[mdfn$openCase == 0,], colour = "black") +
     geom_line(data = mdfn[mdfn$openCase == 1,], colour = "grey") +
     geom_point(data = mdfn2, aes(colour = Visit), size = 4, shape = 124) + 
     geom_text(data=mdfn2, mapping=aes(x=Year, y=Patient, 
     label=substr(Visit, 1, 7), colour=Visit), size=2, 
     vjust=-.4, hjust=-.1, angle = 0)
Eric Fail asked May 07 '12 04:05

1 Answer

You can shift each label vertically according to the numeric code of its Visit level, so that neighbouring visits are drawn at different heights.

The key is the y aesthetic passed to geom_text:

 y=(as.numeric(Patient)+0.25*as.numeric(Visit)%%3)-0.12

This currently produces:
- three label heights, chosen by the numeric code of Visit modulo 3 (%%3, which R evaluates before the multiplication); change the 3 to get more or fewer levels
- adjacent heights separated by a quarter of the distance between y-axis labels (0.25)
- the lowest label 0.12 below the patient's horizontal line
- the next label 0.13 above it, and the highest 0.38 above it
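
To make the arithmetic concrete, here is a minimal base R sketch of the offsets the formula yields for each Visit level. It assumes the melted Visit factor has its eight levels in column order (Referral, VISIT01 to VISIT06, Discharge). The names visit_levels, visit and offset are only for illustration.

```r
# Stand-in for the Visit factor, assuming levels follow the column order of dfn.
visit_levels <- c("Referral", sprintf("VISIT%02d", 1:6), "Discharge")
visit <- factor(visit_levels, levels = visit_levels)

# %% binds tighter than *, so this is 0.25 * (as.numeric(visit) %% 3) - 0.12.
offset <- 0.25 * as.numeric(visit) %% 3 - 0.12

# One row per Visit level: its integer code and the vertical shift of its label.
data.frame(Visit = visit, code = as.numeric(visit), offset = offset)
```

Levels whose code is a multiple of 3 (VISIT02 and VISIT05 here) sit 0.12 below the line. The others sit 0.13 or 0.38 above it.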


require(ggplot2)
require(plyr)
require(reshape)
# create sample data
set.seed(666)
dfn <- data.frame(
  Referral  = seq(as.Date("2007-01-15"), len= 26, by="23 day"),
  VISIT01  = seq(as.Date("2008-06-15"), len= 24, by="15 day")[sample(30, 26)],
  VISIT02  = seq(as.Date("2008-12-15"), len= 24, by="15 day")[sample(30, 26)],
  VISIT03  = seq(as.Date("2009-01-01"), len= 24, by="15 day")[sample(30, 26)],
  VISIT04  = seq(as.Date("2009-03-30"), len= 24, by="60 day")[sample(30, 26)],
  VISIT05  = seq(as.Date("2010-11-30"), len= 24, by="6 day")[sample(30, 26)],
  VISIT06  = seq(as.Date("2011-01-30"), len= 24, by="6 day")[sample(30, 26)],
  Discharge = seq(as.Date("2012-03-30"), len= 24, by="30 day")[sample(30, 26)],
  Patient  = factor(1:26, labels = LETTERS),
  openCase  = rep(0:1, 100)[sample(100, 26)])

# set today's date for cases that do not have a Discharge date
dfn$Discharge[ is.na(dfn$Discharge) ] <- as.Date("2014-01-30")

mdfn <- melt(dfn, id=c('Patient', 'openCase'), variable_name = "Visit")
names(mdfn)[4] <- 'Year' # rename 

# order data in mdfn by 'Referral' in dfn
mdfn$Patient <- factor(mdfn$Patient,levels = 
  (dfn$Patient[order(dfn$Referral)]),ordered = TRUE)

# subset a dataset to avoid 'Discharge' for cases that are not closed 
mdfn2 <- subset(mdfn,!(Visit=="Discharge" & Year > as.Date("2014-01-01")))

# the plot with the labels staggered vertically by Visit
ggplot(mdfn, aes(Year, Patient)) +
  geom_blank() +
  geom_line(data = mdfn[mdfn$openCase == 0,], colour = "black") +
  geom_line(data = mdfn[mdfn$openCase == 1,], colour = "grey") +
  geom_point(data = mdfn2, aes(colour = Visit), size = 4, shape = 124) + 
  geom_text(data=mdfn2, mapping=aes(x=Year, y=(as.numeric(Patient)+0.25*as.numeric(Visit)%%3)-0.12, 
                                    label=substr(Visit, 1, 7), colour=Visit), size=2, 
            hjust=-.1, angle = 0)
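
If you want to experiment with the number of levels and the spacing, one option is to compute the label position as a column before plotting. The sketch below is only an illustration: stagger_y, its parameters, the toy data frame and the label_y column are hypothetical names, not part of ggplot2 or of the code above.

```r
# Hypothetical helper: numeric y position for each label.
# n_levels = number of distinct heights, step = gap between heights,
# base = shift of the lowest height relative to the patient's line.
stagger_y <- function(patient, visit, n_levels = 3, step = 0.25, base = -0.12) {
  as.numeric(patient) + step * (as.numeric(visit) %% n_levels) + base
}

# Tiny stand-in for mdfn2: two patients, four visit types.
visit_names <- c("Referral", "VISIT01", "VISIT02", "VISIT03")
toy <- expand.grid(
  Patient = factor(c("A", "B")),
  Visit   = factor(visit_names, levels = visit_names)
)

# Four heights, 0.2 apart, spanning -0.3 to +0.3 around each patient's line.
toy$label_y <- stagger_y(toy$Patient, toy$Visit, n_levels = 4, step = 0.2, base = -0.3)
toy
```

With the real data you would add the column to mdfn2 in the same way and map y = label_y inside geom_text. Keeping the total spread under 1 stops labels from reaching the neighbouring patient's row.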
Etienne Low-Décarie answered Nov 09 '22 19:11