
Percentile for Each Observation w/r/t Grouping Variable

Tags:

r

I have data that looks like the following. It is grouped by the variable "Year", and I want the percentile of each Score observation relative to the other scores from the same Year, preferably as a vector.

Year   Score
2001   89
2001   70
2001   72
2001   ...
..........
2004   87
2004   90

etc.

How can I do this? aggregate will not work, and I do not think apply will work either.

Ryan R. Rosario asked Jan 29 '10 06:01


2 Answers

Following up on Vince's solution, you can also do this with plyr or by:

library(plyr)  # provides ddply() and .()
ddply(df, .(years), function(x) transform(x, percentile = ecdf(x$scores)(x$scores)))
Jonathan Chang answered Oct 21 '22 03:10


Using ave

ave(d1$scores, d1$year, FUN=function(x) ecdf(x)(x))
Eduardo Leoni answered Oct 21 '22 04:10
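
To see what the ecdf-per-group approach returns, here is a minimal base R sketch. The data frame d1 and its values are made up for illustration (they are not the asker's real data), and the column names year and scores follow the ave answer rather than the question's Year and Score. Note that ecdf gives fractions between 0 and 1, so the sketch also scales them to 0-100 in case percentages are wanted.

```r
# Hypothetical sample data; the values are invented for illustration only.
d1 <- data.frame(
  year   = c(2001, 2001, 2001, 2004, 2004),
  scores = c(89, 70, 72, 87, 90)
)

# ecdf(x) builds the empirical CDF of one year's scores. Calling it on x
# gives, for each score, the fraction of that year's scores that are <= it.
# ave() applies this within each year and returns a vector in the original
# row order.
d1$percentile <- ave(d1$scores, d1$year, FUN = function(x) ecdf(x)(x))

# Scale to a 0-100 range if percentages are preferred.
d1$percentile_pct <- 100 * d1$percentile

print(d1)
```

With this toy data, the 2001 score of 70 is the lowest of three, so it gets 1/3, while the 2004 score of 87 is the lower of two and gets 0.5. Ties would share the same value, since the ECDF counts every observation less than or equal to the score.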