I have a rather dense scatterplot that I am constructing with R's 'ggplot2', and I want to label a subset of points using 'ggrepel'. I want to plot ALL points in the scatterplot but label only a subset. When I do this, ggrepel doesn't account for the unlabelled points when calculating where to put the labels, so the labels overlap points on the plot that I don't want to label.
Here is an example plot illustrating the issue.
# generate data:
library(data.table)
library(stringi)
library(ggplot2)
library(ggrepel)  # for geom_text_repel(), used below
set.seed(20180918)
dt = data.table(
  name = stri_rand_strings(3000, length = 6),
  one = rnorm(n = 3000, mean = 0, sd = 1),
  two = rnorm(n = 3000, mean = 0, sd = 1))
dt[, diff := one - two]
dt[, diff_cat := ifelse(one > 0 & two > 0 & abs(diff) > 1, "type_1",
                 ifelse(one < 0 & two < 0 & abs(diff) > 1, "type_2",
                 ifelse(two > 0 & one < 0 & abs(diff) > 1, "type_3",
                 ifelse(two < 0 & one > 0 & abs(diff) > 1, "type_4", "other"))))]
# make plot
ggplot(dt, aes(x = one, y = two, color = diff_cat)) +
  geom_point()
If I plot only the subset of points I want labelled, then ggrepel is able to place all of the labels in a non-overlapping fashion with respect to other points and labels.
ggplot(dt[abs(diff) > 2 & (!diff_cat %in% c("type_3", "type_4", "other"))],
       aes(x = one, y = two, color = diff_cat)) +
  geom_point() +
  geom_text_repel(data = dt[abs(diff) > 2 & (!diff_cat %in% c("type_3", "type_4", "other"))],
                  aes(x = one, y = two, label = name))
However, when I plot this subset of data AND the original data at the same time, the labels overlap points:
# now add labels to a subset of points on the plot
ggplot(dt, aes(x = one, y = two, color = diff_cat)) +
  geom_point() +
  geom_text_repel(data = dt[abs(diff) > 2 & (!diff_cat %in% c("type_3", "type_4", "other"))],
                  aes(x = one, y = two, label = name))
How can I get the labels for the subset of points to not overlap the points from the original data?
You can try the following:
1. Assign a blank label ("") to all the other points from the original data, so that geom_text_repel takes them into consideration when repelling labels from one another;
2. Increase the box.padding parameter from the default 0.25 to some larger value, for greater distance between labels.
Example code (with box.padding = 1, and with the axis limits widened through coord_cartesian to give the labels more room):
library(dplyr)  # provides %>% and mutate()
ggplot(dt,
       aes(x = one, y = two, color = diff_cat)) +
  geom_point() +
  geom_text_repel(data = . %>%
                    mutate(label = ifelse(diff_cat %in% c("type_1", "type_2") & abs(diff) > 2,
                                          name, "")),
                  aes(label = label),
                  box.padding = 1,
                  show.legend = FALSE) + # this removes the 'a' from the legend
  coord_cartesian(xlim = c(-5, 5), ylim = c(-5, 5)) +
  theme_bw()
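The same blank-label idea can also be sketched without dplyr, by adding the label column to the data.table directly. This is only a sketch under assumptions: the data is a synthetic stand-in for the question's dt, the names are generated with sprintf rather than stringi, the points are drawn in a single grey instead of being colored by diff_cat, and the column name label is my own choice.

```r
library(data.table)
library(ggplot2)
library(ggrepel)

# Synthetic stand-in for the question's data (assumption: names are made up).
set.seed(20180918)
dt <- data.table(
  name = sprintf("pt%04d", 1:3000),
  one  = rnorm(3000),
  two  = rnorm(3000)
)
dt[, diff := one - two]

# Every point starts with a blank label, so it still takes part in repulsion.
dt[, label := ""]

# Only the points to annotate get their name. In the question, type_1 and
# type_2 are the quadrants where `one` and `two` have the same sign.
dt[abs(diff) > 2 & ((one > 0 & two > 0) | (one < 0 & two < 0)), label := name]

# One data set for both layers: all points are drawn, and the text layer sees
# all points but only draws the non-empty labels.
p <- ggplot(dt, aes(x = one, y = two)) +
  geom_point(colour = "grey60") +
  geom_text_repel(aes(label = label), box.padding = 1, show.legend = FALSE)
```

Printing p draws the plot. Keeping the label in the data rather than in a filtered copy means the text layer and the point layer always agree on which points exist. As far as I know, newer ggrepel releases (0.9.0 and later) also have a max.overlaps argument that can drop labels in dense regions, so it may be worth checking that setting if labels go missing; it is not part of 0.8.0.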
Here's another attempt, with box.padding = 2:
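A minimal sketch of how the two padding values could be compared, assuming a synthetic data frame in place of the question's dt. The helper plot_with_padding is hypothetical (it is not part of ggrepel), and the grey point color and generated names are likewise assumptions made to keep the example self-contained.

```r
library(ggplot2)
library(ggrepel)

# Synthetic stand-in for the question's data (assumption: names are made up).
set.seed(20180918)
df <- data.frame(one = rnorm(3000), two = rnorm(3000))
df$name <- sprintf("pt%04d", seq_len(nrow(df)))
df$diff <- df$one - df$two

# Blank label everywhere except the same-sign points with |diff| > 2.
same_sign <- (df$one > 0 & df$two > 0) | (df$one < 0 & df$two < 0)
df$label <- ifelse(abs(df$diff) > 2 & same_sign, df$name, "")

# Hypothetical helper: the same plot, with box.padding as the only knob.
plot_with_padding <- function(data, padding) {
  ggplot(data, aes(x = one, y = two)) +
    geom_point(colour = "grey60") +
    geom_text_repel(aes(label = label), box.padding = padding) +
    coord_cartesian(xlim = c(-5, 5), ylim = c(-5, 5)) +
    theme_bw()
}

p1 <- plot_with_padding(df, 1)  # padding used in the first example
p2 <- plot_with_padding(df, 2)  # larger padding, more room around each label
```

Printing p1 and p2 in turn shows the effect of the padding alone, since the data, limits, and theme are held fixed.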
(Note: I'm using ggrepel 0.8.0. I'm not sure if all the functionalities are present for earlier package versions.)