
How to prevent two labels from overlapping in a bar chart?

Tags: r, ggplot2

The image below shows a chart that I created with the code below. I highlighted the missing or overlapping labels. Is there a way to tell ggplot2 to not overlap labels?

[Image: bar chart with the missing and overlapping count labels highlighted]

week = c(0, 1, 1, 1, 1, 2, 2, 3, 4, 5)
statuses = c('Shipped', 'Shipped', 'Shipped', 'Shipped', 'Not-Shipped', 'Shipped', 'Shipped', 'Shipped', 'Not-Shipped', 'Shipped')

dat <- data.frame(Week = week, Status = statuses)

p <- qplot(factor(Week), data = dat, geom = "bar", fill = factor(Status))
p <- p + geom_bar()
# Below is the most important line, that's the one which displays the value
p <- p + stat_bin(aes(label = ..count..), geom = "text", vjust = -1, size = 3)
p

Martin asked Apr 21 '13 01:04



3 Answers

You can use a variant of the well-known population pyramid.

Some sample data (code inspired by Didzis Elferts' answer):

set.seed(654)
week <- sample(0:9, 3000, replace = TRUE, prob = rchisq(10, df = 3))
status <- factor(rbinom(3000, 1, 0.15), labels = c("Shipped", "Not-Shipped"))
data.df <- data.frame(Week = week, Status = status)

Count the rows for each combination of week and status, then make the Not-Shipped counts negative:

library("plyr")
plot.df <- ddply(data.df, .(Week, Status), nrow)
plot.df$V1 <- ifelse(plot.df$Status == "Shipped",
                     plot.df$V1, -plot.df$V1)

Draw the plot. Note that the y-axis labels are adapted to show positive values on either side of the baseline.

library("ggplot2")
ggplot(plot.df) + 
  aes(x = as.factor(Week), y = V1, fill = Status) +
  geom_bar(stat = "identity", position = "identity") +
  scale_y_continuous(breaks = 100 * -1:5,
                     labels = 100 * c(1, 0:5)) +
  geom_text(aes(y = sign(V1) * max(V1) / 30, label = abs(V1)))

The plot:

[Image: the resulting pyramid-style bar chart]

For production purposes you'd need to determine the appropriate y-axis tick labels dynamically.
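One possible way to avoid hard-coding the tick labels is sketched below. It is only an illustration under a few assumptions: the sample data is made up, the counts are computed with base R's table() instead of plyr, and it relies on scale_y_continuous() accepting a function for labels. ggplot2 then picks the breaks itself and abs() strips the sign from each tick label.

```r
library(ggplot2)

# Hypothetical sample data in the same shape as above
set.seed(654)
data.df <- data.frame(
  Week   = sample(0:9, 3000, replace = TRUE),
  Status = factor(rbinom(3000, 1, 0.15), labels = c("Shipped", "Not-Shipped"))
)

# Count rows per Week/Status with base R, then mirror one category below zero
plot.df <- as.data.frame(table(Week = data.df$Week, Status = data.df$Status))
plot.df$V1 <- ifelse(plot.df$Status == "Shipped", plot.df$Freq, -plot.df$Freq)

p <- ggplot(plot.df, aes(x = Week, y = V1, fill = Status)) +
  geom_bar(stat = "identity", position = "identity") +
  # ggplot2 chooses the breaks; abs() removes the sign from the tick labels
  scale_y_continuous(labels = abs) +
  geom_text(aes(y = sign(V1) * max(V1) / 30, label = abs(V1)))
p
```

Because the labels are derived from whatever breaks ggplot2 chooses, the axis should stay correct when the counts change.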

Alexander Vos de Wael answered Oct 12 '22 23:10


I made new sample data (inspired by @agstudy's code).

week <- sample(0:5, 1000, replace = TRUE, prob = c(0.2, 0.05, 0.15, 0.5, 0.03, 0.1))
# gl(2, 1000) yields 2000 values, so data.frame() recycles the 1000 weeks once
statuses <- gl(2, 1000, labels = c('Not-Shipped', 'Shipped'))
dat <- data.frame(Week = week, Status = statuses)

Using ddply() from the plyr package, I built a new data frame, text.df, for the labels. Its count column holds the number of observations for each combination of Week and Status. I then added a ypos column containing the cumulative sum of count within each Week plus 15; this is used as the y position of the label. For Not-Shipped, ypos is replaced with -10.

library(plyr)
text.df <- ddply(dat, .(Week, Status), function(x) data.frame(count = nrow(x)))
text.df <- ddply(text.df, .(Week), transform, ypos = cumsum(count) + 15)
text.df$ypos[text.df$Status == "Not-Shipped"] <- -10

The labels are then plotted with geom_text() using the new data frame.

ggplot(dat, aes(as.factor(Week), fill = Status)) +
  geom_bar() +
  geom_text(data = text.df, aes(x = as.factor(Week), y = ypos, label = count))
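If plyr is not available, the same label positions can probably be computed with base R alone. The sketch below is an assumption-laden equivalent, not the code used for the plot above: it uses made-up sample data, table() for the counts, and ave() with cumsum for the running total within each week.

```r
# Hypothetical sample data in the same shape as above
set.seed(2)
dat <- data.frame(
  Week   = sample(0:5, 2000, replace = TRUE),
  Status = gl(2, 1000, labels = c("Not-Shipped", "Shipped"))
)

# One row per Week/Status combination with its count
text.df <- as.data.frame(table(Week = dat$Week, Status = dat$Status),
                         responseName = "count")

# Order rows so the running total goes Not-Shipped, then Shipped, per week
text.df <- text.df[order(text.df$Week, text.df$Status), ]

# Running total within each week, lifted 15 units
text.df$ypos <- ave(text.df$count, text.df$Week, FUN = cumsum) + 15

# Park the Not-Shipped labels just below the axis
text.df$ypos[text.df$Status == "Not-Shipped"] <- -10
```

The resulting text.df has the same Week, Status, count and ypos columns, so it can be passed to geom_text() in the same way. Note that Week is already a factor here.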

[Image: the resulting stacked bar chart with count labels]

Didzis Elferts answered Oct 12 '22 22:10


One way to avoid overlaps is to dodge the positions of both the bars and the text labels. To keep the top labels from being cut off, you can set ylim. Here is an example.

[Image: the resulting dodged bar chart with count labels]

## Create some more realistic data, similar to your picture
week <- sample(0:5, 1000, replace = TRUE)
statuses <- gl(2, 1000, labels = c('Not-Shipped', 'Shipped'))
dat <- data.frame(Week = week, Status = statuses)

## for dodging
dodgewidth <- position_dodge(width=0.9)
## get max y to set ylim
ymax <- max(table(dat$Week,dat$Status))+20
ggplot(dat, aes(x = factor(Week), fill = factor(Status))) +
  geom_bar(position = dodgewidth) +
  stat_bin(geom = "text", position = dodgewidth, aes(label = ..count..),
           vjust = -1, size = 5) +
  ylim(0, ymax)
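
In newer ggplot2 releases, stat_bin() refuses a discrete x variable and the ..count.. notation has been superseded by after_stat(count). The sketch below shows how the same dodged labels might be written, assuming ggplot2 3.3.0 or later. The sample data and the seed are made up for illustration.

```r
library(ggplot2)

# Hypothetical sample data: 1000 rows per status, weeks 0 to 5
set.seed(1)
dat <- data.frame(
  Week   = sample(0:5, 2000, replace = TRUE),
  Status = gl(2, 1000, labels = c("Not-Shipped", "Shipped"))
)

# Shared dodge so bars and labels are shifted by the same amount
dodge <- position_dodge(width = 0.9)

# Headroom above the tallest bar so its label is not cut off
ymax <- max(table(dat$Week, dat$Status)) + 20

p <- ggplot(dat, aes(x = factor(Week), fill = Status)) +
  geom_bar(position = dodge) +
  # stat = "count" tallies rows per bar; after_stat(count) exposes the tally
  geom_text(stat = "count", aes(label = after_stat(count)),
            position = dodge, vjust = -1, size = 5) +
  ylim(0, ymax)
p
```

Using one position_dodge() object for both layers keeps each label centred over its own bar.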

agstudy answered Oct 12 '22 22:10