
Avoiding overcrowding of labels in R graphs

Tags: plot, r, ggplot2

I am trying to avoid overcrowding of the labels in the following plot:

set.seed(123)
position <- c(rep (0,5), rnorm (5,1,0.1), rnorm (10, 3,0.1), rnorm (3, 4, 0.2), 5, rep(7,5), rnorm (3, 8,2),  rnorm (10,9,0.5),
               rep (0,5), rnorm (5,1,0.1), rnorm (10, 3,0.1), rnorm (3, 4, 0.2), 5, rep(7,5), rnorm (3, 8,2),  rnorm (10,9,0.5))
group <- c(rep (1, length (position)/2),rep (2, length (position)/2)  )
mylab <- paste ("MR", 1:length (group), sep = "")
barheight <- 0.5

y.start <- c(group-barheight/2)
y.end <- c(group+barheight/2)
mydf <- data.frame (position, group, barheight, y.start, y.end, mylab)


plot(0,type="n",ylim=c(0,3),xlim=c(0,10),axes=F,ylab="",xlab="")
#Create two horizontal lines
require(fields)
yline(1,lwd=4)
yline(2,lwd=4)
#Create text for the lines
text(10,1.1,"Group 1",cex=0.7)
text(10,2.1,"Group 2",cex=0.7)
#Draw vertical bars
lng = length(position)/2
lg1 = lng+1
lg2 = lng*2
segments(mydf$position[1:lng],mydf$y.start[1:lng],y1=mydf$y.end[1:lng])
segments(mydf$position[lg1:lg2],mydf$y.start[lg1:lg2],y1=mydf$y.end[lg1:lg2])
text(mydf$position[1:lng],mydf$y.start[1:lng]+0.65, mydf$mylab[1:lng], srt = 90)
text(mydf$position[lg1:lg2],mydf$y.start[lg1:lg2]+0.65, mydf$mylab[lg1:lg2], srt = 90)

You can see that some areas are crowded with labels where the x values are the same or similar. I want to display only one label when there are multiple labels at the same point. For example,

mydf$position[1:5] are all 0,

but corresponding labels mydf$mylab[1:5] -

 MR1  MR2  MR3  MR4  MR5 

I want to display only the first one, "MR1".

Similarly, if neighbouring points are too close (say, within a difference of 0.35), they should be considered a single cluster and only the first label should be displayed. In this way I would be able to get rid of the overcrowding of labels. How can I achieve this?


SHRram asked Feb 20 '13 02:02



1 Answer

If you space the labels out and add some extra lines you can label every marker.

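clpl <- function(xdata, names, y=1, dy=0.25, add=FALSE){
  # sort the markers so the evenly spaced labels keep the same left-to-right order
  o = order(xdata)
  xdata=xdata[o]
  names=names[o]
  if(!add)plot(0,type="n",ylim=c(y-1,y+2),xlim=range(xdata),axes=FALSE,ylab="",xlab="")
  # main horizontal line at height y
  abline(h=y,lwd=4)
  # vertical tick for each marker
  segments(xdata,y-dy,xdata,y+dy)
  # evenly spaced label positions across the data range
  tpos = seq(min(xdata),max(xdata),length.out=length(xdata))
  text(tpos,y+2*dy,names,srt=90,adj=0)
  # callout lines joining each tick to its label
  segments(xdata,y+dy,tpos,y+2*dy)
}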

Then using your data:

clpl(mydf$position[lg1:lg2],mydf$mylab[lg1:lg2])

gives:

[Figure: marking lines with callouts]

You could then think about labelling clusters underneath the main line.

I've not given much thought to doing multiple lines in a plot, but I think with a bit of mucking with my code and the add parameter it should be possible. You could also use colour to show clusters. I'm fairly sure these techniques are present in some of the clustering packages for R...

Obviously with a lot of markers even this is going to get smushed, but with a lot of clusters the same thing is going to happen. Maybe you end up labelling clusters with this technique?
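If you do want one label per cluster, as the question asks, here is a rough sketch of one way to pick them. It is not part of the code above: the helper name first_label_per_cluster and its gap argument are made up for illustration, and it assumes a non-empty numeric vector of positions. It sorts the positions, starts a new cluster wherever the distance to the previous sorted position exceeds gap, and blanks every label except the first in each cluster. Note that close points chain together, so a long run of markers each within gap of its neighbour ends up as a single cluster.

```r
# Hypothetical helper (an illustration, not from the original posts):
# keep only the first label of each cluster of nearby positions.
first_label_per_cluster <- function(x, labels, gap = 0.35) {
  o <- order(x)                  # ties keep their original order
  xs <- x[o]
  # a new cluster starts wherever the step from the previous sorted
  # position is larger than `gap`
  cluster <- cumsum(c(TRUE, diff(xs) > gap))
  keep <- !duplicated(cluster)   # first member of each cluster
  out <- rep("", length(x))      # every other label becomes blank
  out[o[keep]] <- as.character(labels[o[keep]])
  out
}

x <- c(0, 0, 0.1, 1, 1.2, 3)
first_label_per_cluster(x, paste0("MR", seq_along(x)))
# "MR1" ""    ""    "MR4" ""    "MR6"
```

The result has the same length and order as the input, so it could be handed to text() in place of mydf$mylab, one group at a time.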

Spacedman answered Oct 10 '22 05:10