Is there a function that avoids overlapping data labels for identical data points in a scatter plot? I have checked the various questions and answers on textxy, direct.label, and geom_text(), but without success. Maybe it's simply not possible.
Here's a sample of the relevant data:
vetotype.x <- structure(list(cowc = structure(c(5L, 7L, 24L, 24L, 23L, 36L,
34L, 38L, 23L, 6L, 8L, 38L, 38L, 23L, 5L, 7L, 24L, 24L, 23L,
36L, 34L, 38L, 23L, 6L, 8L, 38L, 38L, 23L), .Label = c("AFG",
"ANG", "AZE", "BNG", "BOS", "BUI", "CAM", "CDI", "CHA", "COL",
"CRO", "DOM", "DRC", "ETH", "GNB", "GRG", "GUA", "IND", "INS",
"IRQ", "KEN", "LAO", "LBR", "LEB", "MAL", "MLD", "MZM", "NEP",
"NIC", "PHI", "PNG", "RUS", "RWA", "SAF", "SAL", "SIE", "SOM",
"SUD", "TAJ", "UKG", "YAR", "ZIM"), class = "factor"), conflict = c("Bosnia 92-95",
"Cambodia 70-91", "Lebanon 58-58", "Lebanon 75-89", "Liberia 89-93",
"SieLeo 91-96", "Stafrica 83-91", "Sudan 63-72", "Liberia 94-96",
"Burundi 1993-2005", "Cote d'Ivoire 2002-2007", "Darfur, Sudan 2003-2010",
"Sudan 83-05", "Liberia 1999-2003", "Bosnia 92-95", "Cambodia 70-91",
"Lebanon 58-58", "Lebanon 75-89", "Liberia 89-93", "SieLeo 91-96",
"Stafrica 83-91", "Sudan 63-72", "Liberia 94-96", "Burundi 1993-2005",
"Cote d'Ivoire 2002-2007", "Darfur, Sudan 2003-2010", "Sudan 83-05",
"Liberia 1999-2003"), totalps = c(3L, 2L, 2L, 2L, 1L, 3L, 4L,
3L, 1L, 3L, 3L, 4L, 3L, 3L, 3L, 2L, 2L, 2L, 1L, 3L, 4L, 3L, 1L,
3L, 3L, 4L, 3L, 3L), vetotype = structure(c(1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("strictvetos", "lenientvetos"
), class = "factor"), intensity = c(3L, 4L, 2L, 5L, 2L, 2L, 2L,
2L, 2L, 3L, 2L, 2L, 2L, 2L, 3L, 4L, 2L, 6L, 2L, 2L, 4L, 2L, 2L,
3L, 3L, 2L, 2L, 2L)), .Names = c("cowc", "conflict", "totalps",
"vetotype", "intensity"), class = "data.frame", row.names = c(NA,
-28L))
Here's my code:
library(ggplot2)
vetotype.plot <- ggplot(vetotype.x, aes(x=totalps, y=intensity, color=conflict)) +
  geom_point() +
  labs(x="number of power-sharing arenas", y="intensity") +
  ggtitle("Number of Power-Sharing areas and Veto intensity") +
  geom_text(aes(label=conflict), hjust=0, vjust=0, size=4) +
  scale_x_continuous(limits=c(1, 5)) +
  theme(legend.position="none") +
  facet_wrap(~vetotype, nrow=2)
plot(vetotype.plot)
And below is my graph. I manually highlighted those data points which are overlapping.
What I am looking for is an 'automatic' way to display the labels of overlapping data points so that they don't overlap. Is there a function for this purpose? Many thanks!
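This is not a completely general solution, but it does seem to work in your case. It counts how many rows share each point within a facet and staggers their labels vertically through vjust.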
library(ggplot2)
# identify duplicated points
dupes <- aggregate(conflict~totalps+intensity+vetotype,vetotype.x,length)
colnames(dupes)[4] = "dupe"
df <- merge(vetotype.x,dupes) # add dupe column
df$vjust <- 0 # default vertical offset is 0
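# calculate vertical offsets based on number of dupes
# (loop only over group sizes that actually occur, so a missing size cannot cause an error)
for (i in unique(df$dupe[df$dupe > 1])) df[df$dupe==i,]$vjust <- seq(-trunc(i/2), -trunc(i/2)+i-1)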
# render the plot
vetotype.plot <- ggplot(df, aes(x=totalps, y=intensity, color=conflict)) +
  geom_point() +
  labs(x="number of power-sharing arenas", y="intensity") +
  ggtitle("Number of Power-Sharing areas and Veto intensity") +
  geom_text(aes(label=conflict, vjust=vjust), hjust=0, size=4) +
  scale_x_continuous(limits=c(1, 5)) +
  scale_y_continuous(limits=c(1, 6)) +
  theme(legend.position="none") +
  facet_wrap(~vetotype, nrow=2)
plot(vetotype.plot)
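To see what the offset formula does, here is a small sketch that evaluates it on its own for a few group sizes. The helper name `offsets` is made up for illustration and is not part of the answer's code.

```r
# Hypothetical helper: the answer's offset formula for a group of i identical points.
offsets <- function(i) seq(-trunc(i / 2), -trunc(i / 2) + i - 1)

# Print the vjust values assigned to groups of 2 to 5 overlapping labels.
for (i in 2:5) cat(i, "labels:", offsets(i), "\n")
```

Each group gets consecutive integer vjust values roughly centred on zero (for example -1, 0, 1 for three labels), so the labels should stack about one text height apart instead of being drawn on top of each other.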
ggrepel can now do this easily:
https://twitter.com/slowkow/status/686341190749392896
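A minimal sketch of the ggrepel approach, assuming the package is installed. The `toy` data frame is invented for illustration (it reuses a few conflict names but is not the asker's data); `geom_text_repel()` is the ggrepel counterpart of `geom_text()`.

```r
library(ggplot2)
library(ggrepel)  # assumed installed, e.g. via install.packages("ggrepel")

# Toy data (hypothetical): three labels share the point (3, 2).
toy <- data.frame(
  totalps   = c(3, 3, 3, 1, 4),
  intensity = c(2, 2, 2, 2, 4),
  conflict  = c("Sudan 63-72", "Sudan 83-05", "SieLeo 91-96",
                "Liberia 89-93", "Stafrica 83-91")
)

p <- ggplot(toy, aes(x = totalps, y = intensity, color = conflict)) +
  geom_point() +
  geom_text_repel(aes(label = conflict), size = 4) +  # labels are pushed apart
  theme(legend.position = "none")
print(p)
```

For the plot in the question, swapping the geom_text() layer for geom_text_repel(aes(label=conflict), size=4) should be the only change needed, though the exact label positions can vary between runs.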