Here is my reproducible data:
library("ggplot2")
library("ggplot2movies")
library("quantreg")
set.seed(2154)
msamp <- movies[sample(nrow(movies), 1000), ]
I am trying to become acquainted with stat_quantile but the example from the documentation raises a couple of questions.
mggp <- ggplot(data = msamp, mapping = aes(x = year, y = rating)) +
  geom_point() +
  stat_quantile(formula = y ~ x, quantiles = c(0, 0.25, 0.50, 0.75, 1)) +
  theme_classic(base_size = 12) +
  ylim(c(0, 10))
mggp
To my understanding, quantiles split the data into parts that fall below the defined cut-off values, correct? If I define quantiles as in the code above, I get five lines. Why? What do they represent?
It seems that the quantiles are calculated based on the dependent variable on the y-axis (rating). Is it possible to reverse this, i.e. to split the data based on quantiles of 'year'?
This function performs quantile regression, and each line is the fitted regression line for one of the requested quantiles.
From Wikipedia:
Quantile regression is a type of regression analysis used in statistics and econometrics. Whereas the method of least squares results in estimates that approximate the conditional mean of the response variable given certain values of the predictor variables, quantile regression aims at estimating either the conditional median or other quantiles of the response variable.
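Thus each line in the plot is an estimate of one conditional quantile of rating as a function of year, e.g. the median or the 75th percentile. Five quantiles were requested (0, 0.25, 0.50, 0.75 and 1), which is why five lines are drawn.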
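As a rough check of this interpretation, the sketch below fits the same kind of model directly with quantreg::rq() on the sample from the question. It assumes the ggplot2movies and quantreg packages are installed. The names taus, fit, X, fitted_lines and share_below are made up for illustration, and the extreme quantiles 0 and 1 are left out to keep the example simple.

```r
library("ggplot2movies")
library("quantreg")

# Same sample as in the question
set.seed(2154)
msamp <- movies[sample(nrow(movies), 1000), ]

# Fit one linear quantile regression of rating on year per requested quantile
taus <- c(0.25, 0.50, 0.75)
fit <- rq(rating ~ year, tau = taus, data = msamp)

# One column of coefficients (intercept, slope) per quantile
coef(fit)

# Evaluate each fitted line at the observed years
X <- cbind(1, msamp$year)
fitted_lines <- X %*% coef(fit)

# Share of movies whose rating lies on or below each fitted line;
# for a quantile regression this should be close to the matching tau
share_below <- sapply(seq_along(taus), function(i) {
  mean(msamp$rating <= fitted_lines[, i])
})
share_below
```

If the shares come out near 0.25, 0.50 and 0.75, that matches the reading above: each line separates roughly that fraction of the ratings at any given year, rather than cutting the whole data set at a single fixed rating.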
You can find a detailed technical discussion in the vignette of the quantreg package.
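On the second question, stat_quantile always models y as a function of x, so one possible way to get quantiles of year instead is to swap the two aesthetics and flip the coordinates back. This is only a sketch under that assumption, not something taken from the documentation example, and mggp_rev is a made-up name. It fits quantiles of year conditional on rating, which may or may not be what is wanted.

```r
library("ggplot2")
library("ggplot2movies")
library("quantreg")

# Same sample as in the question
set.seed(2154)
msamp <- movies[sample(nrow(movies), 1000), ]

# Swap the roles: year is now the response (y), rating the predictor (x).
# coord_flip() puts year back on the horizontal axis for display.
mggp_rev <- ggplot(data = msamp, mapping = aes(x = rating, y = year)) +
  geom_point() +
  stat_quantile(formula = y ~ x, quantiles = c(0.25, 0.50, 0.75)) +
  coord_flip() +
  theme_classic(base_size = 12)
mggp_rev
```

The lines in this plot estimate the 25th, 50th and 75th percentile of year for a given rating, so they are not simply the earlier lines mirrored.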