I am using the Australian AIDS Survival Data, this time to create scatterplots.
To show survival by sex across the reported transmission categories (T.categ), I plot the chart this way:
library(ggplot2)
library(dplyr)  # provides the %>% pipe

data <- read.csv("https://raw.githubusercontent.com/vincentarelbundock/Rdatasets/master/csv/MASS/Aids2.csv")
data %>%
  ggplot() +
  geom_jitter(aes(T.categ, sex, colour = status))
The code produces a chart, but each time I run it, it seems to produce a different one. Here are two of them put together.
Is anything wrong with the code? Is it normal for each run to give a different chart?
If you use geom_point instead of geom_jitter, you can add position = position_jitter(), which accepts the seed argument:
library(ggplot2)
p <- ggplot(mtcars, aes(as.factor(cyl), disp))
p + geom_point(position = position_jitter(seed = 42))
p + geom_point(position = position_jitter(seed = 1))
And back to seed 42:
p + geom_point(position = position_jitter(seed = 42))
Created on 2020-07-02 by the reprex package (v0.3.0)
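Applied to the plot from the question, the same idea might look like the sketch below. It assumes that MASS::Aids2 (shipped with the MASS package) holds the same data as the Aids2.csv file read in the question, and the seed value 42 is arbitrary. layer_data() is used only to check that the jittered coordinates do not change between builds.

```r
library(ggplot2)

# Assumption: MASS::Aids2 is the same data set as the Aids2.csv file in the question.
aids <- MASS::Aids2

p <- ggplot(aids) +
  geom_point(
    aes(T.categ, sex, colour = status),
    position = position_jitter(seed = 42)  # arbitrary fixed seed
  )

# Build the layer twice and compare the jittered coordinates.
first  <- layer_data(p)
second <- layer_data(p)
identical(first$x, second$x) && identical(first$y, second$y)

p  # draw the plot
```

Because the seed is stored in the position itself, the plot should look the same on every run without touching the global random number state.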
Try setting the seed when plotting:
set.seed(1)
data %>%
  ggplot() +
  geom_jitter(aes(T.categ, sex, colour = status))
From the manual (?geom_jitter):
It adds a small amount of random variation to the location of each point, and is a useful way of handling overplotting caused by discreteness in smaller datasets.
To make that "random variation" reproducible, we need to call set.seed before plotting.
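A minimal way to check this, sketched with the built-in mtcars data rather than the AIDS data: the helper jittered_x is a hypothetical name introduced only for this illustration, and layer_data() is used to read back the jittered positions instead of comparing pictures by eye.

```r
library(ggplot2)

# Hypothetical helper: set the seed, build a jittered plot,
# and return the jittered x positions of its first layer.
jittered_x <- function(seed) {
  set.seed(seed)
  p <- ggplot(mtcars, aes(as.factor(cyl), disp)) +
    geom_jitter()
  layer_data(p)$x
}

# The same seed gives the same jitter.
same_seed <- identical(jittered_x(1), jittered_x(1))
same_seed
```

Note that set.seed changes the global random number state, so it also affects any random draws made later in the session, whereas the seed argument of position_jitter() only affects the jitter.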