> q <- quantile(faithful$eruptions)
> q
     0%     25%     50%     75%    100%
1.60000 2.16275 4.00000 4.45425 5.10000
This gives the result shown above; the faithful dataset ships with R.
head(faithful)
eruptions waiting
1 3.600 79
2 1.800 54
3 3.333 74
4 2.283 62
5 4.533 85
6 2.883 55
I want a data frame containing the data plus an additional column indicating the quartile to which each observation belongs. For example, the final dataset should look like:
eruptions waiting Quartile
1 3.600 79 Q1
2 1.800 54 Q2
3 3.333 74
4 2.283 62
5 4.533 85
6 2.883 55
How can this be done?
Quartiles are a type of percentile. The first quartile (Q1, or the lowest quartile) is the 25th percentile, meaning that 25% of the data falls below the first quartile. The second quartile (Q2, or the median) is the 50th percentile, meaning that 50% of the data falls below the second quartile.
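To make the percentile reading concrete, here is a small sketch using the same faithful data as the question. The variable names (x, q1, share_below_q1) are only illustrative, and the result depends on the default quantile type in R.

```r
# Sketch: check what share of observations lies below the first quartile
x <- faithful$eruptions
q1 <- quantile(x, probs = 0.25)

# Proportion of eruption times strictly below Q1; this should be close to 0.25
share_below_q1 <- mean(x < q1)
share_below_q1
```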
Quartiles are used to calculate the interquartile range, which is a measure of variability around the median. The interquartile range is simply calculated as the difference between the first and third quartile: Q3–Q1. In effect, it is the range of the middle half of the data that shows how spread out the data is.
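As a quick illustration, the interquartile range can be computed by hand from the quantile() output and compared with the built-in IQR() function. This is a minimal sketch; iqr_manual and iqr_builtin are illustrative names.

```r
# Sketch: interquartile range as Q3 - Q1, checked against IQR()
x <- faithful$eruptions
q <- quantile(x)

# Using the quartiles printed in the question: 4.45425 - 2.16275
iqr_manual <- unname(q["75%"] - q["25%"])

# Built-in equivalent (uses the same default quantile type)
iqr_builtin <- IQR(x)

all.equal(iqr_manual, iqr_builtin)
```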
Quartiles tell us about the spread of a data set by breaking the data set into quarters, just like the median breaks it in half.
The lower quartile is the median of the data values in the lower half of the data set, the middle quartile is the overall median, and the upper quartile is the median of the data values in the upper half of the data set.
Something along the lines of this? Use the values returned by the quantile function as break points to cut the desired vector.
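# include.lowest = TRUE keeps the minimum (1.6) in Q1 instead of turning it into NA
faithful$kva <- cut(faithful$eruptions, breaks = q, include.lowest = TRUE)
levels(faithful$kva) <- c("Q1", "Q2", "Q3", "Q4")
head(faithful, 5)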
eruptions waiting kva
1 3.600 79 Q2
2 1.800 54 Q1
3 3.333 74 Q2
4 2.283 62 Q2
5 4.533 85 Q4
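One detail worth checking with cut(): by default it builds intervals that are open on the left, so an observation equal to the smallest break is not assigned to any bin unless include.lowest = TRUE is passed. The sketch below compares both behaviours on a separate vector, without modifying faithful; the names open_left and closed are illustrative only.

```r
# Sketch: effect of include.lowest when cutting at the quartile breaks
x <- faithful$eruptions
q <- quantile(x)
labs <- c("Q1", "Q2", "Q3", "Q4")

# Default: intervals are (lower, upper], so the minimum falls outside every bin
open_left <- cut(x, breaks = q, labels = labs)
sum(is.na(open_left))   # observations left without a quartile label

# include.lowest = TRUE closes the first interval on the left
closed <- cut(x, breaks = q, labels = labs, include.lowest = TRUE)
sum(is.na(closed))      # no unlabelled observations remain

# Counts per quartile
table(closed)
```

Passing labels directly to cut() is an alternative to assigning levels() afterwards; both give the same factor.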