 

Quantile normalize a single column in R

I have a column in my data frame in R, data$height. The values range from 0 to 400. I want to normalize the values in the column so that the resulting values lie between 0 and 1 and are quantiles, i.e. the median value in the dataset should map to 0.5.

Any suggestions on how to do this?

show_stopper asked Feb 20 '26 18:02



2 Answers

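The R function ppoints is the usual way to generate the probability points (percentile ranks) for a sample of size n. It returns them in increasing order, so index the result by rank(x) to line the points up with your original values.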

See its a argument, which enters the formula (i - a)/(n + 1 - 2a) for i = 1, ..., n:

Setting a=1 takes the smallest value to 0 and the largest value to 1.

Setting a=0 takes the smallest value to 1/(n+1) and the largest value to n/(n+1).

By default, a=3/8 if n is 10 or less and a=1/2 if n is larger than 10.
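As a rough illustration of how the choice of a moves the endpoints, here is a minimal sketch. The height vector holds made-up values with no ties; it is an assumption for the example and not the asker's data.

```r
# Probability points for n = 5 under different choices of `a`
n <- 5
p_closed  <- ppoints(n, a = 1)   # endpoints are exactly 0 and 1
p_open    <- ppoints(n, a = 0)   # i / (n + 1): endpoints 1/6 and 5/6
p_default <- ppoints(n)          # a = 3/8 here because n <= 10

# ppoints() returns the points in increasing order, so index by rank
# to assign each point to the matching data value.
# Hypothetical data; assumes no ties, so every rank is a whole number.
height <- c(106.2, 148.8, 229.1, 363.3, 80.7)
q <- ppoints(length(height), a = 1)[rank(height)]
q
```

With tied values rank() returns fractional ranks by default, so this indexing trick would need a different ties.method or an explicit formula instead.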

Other R functions use ppoints internally. For example, qqnorm calls it to compute the theoretical quantiles for normal quantile-quantile plots.
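One way to see the connection with qqnorm is to rebuild its theoretical quantiles by hand. This is only a sketch: the seed, the sample size, and the variable names are arbitrary choices for the example, and it assumes the data contain no ties or missing values.

```r
set.seed(1)                          # arbitrary seed, for reproducibility only
y <- runif(20, min = 0, max = 400)   # made-up data on the same 0 to 400 scale

# Theoretical normal quantiles as computed by qqnorm (no plot is drawn)
qq <- qqnorm(y, plot.it = FALSE)

# The same values built by hand: probability points, pushed through qnorm,
# then reordered by rank so they line up with y
by_hand <- qnorm(ppoints(length(y)))[rank(y)]

same <- isTRUE(all.equal(qq$x, by_hand))
same
```

If same is TRUE, the probabilities underlying the plot are exactly the ppoints values, which is why ppoints(n)[rank(x)] is a natural rescaling to the 0 to 1 range.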

Glen_b answered Feb 23 '26 08:02



You want some kind of rank, rescaled to lie between 0 and 1, for example:

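> set.seed(1)
> exdf <- data.frame(height = runif(5, min=0, max=400))
> # r1: smallest value maps to 0, largest to 1
> exdf$r1 <- (rank(exdf$height) - 1) / (length(exdf$height) - 1)
> # r2: midpoint of each rank interval, so values stay strictly inside (0, 1)
> exdf$r2 <- (rank(exdf$height) - 1/2) / length(exdf$height)
> exdf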
     height   r1  r2
1 106.20347 0.25 0.3
2 148.84956 0.50 0.5
3 229.14135 0.75 0.7
4 363.28312 1.00 0.9
5  80.67277 0.00 0.1
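
The rank-based rescaling above could be wrapped in a small helper so it can be applied to any column. This is a sketch only: quantile_rank is a hypothetical name, and the handling of missing values and ties (NA kept as NA, tied values sharing their average rank) is an assumption rather than part of the answer. It also assumes at least two non-missing values.

```r
# Hypothetical helper: rescale a numeric vector to rank-based values in [0, 1]
quantile_rank <- function(x) {
  n <- sum(!is.na(x))                                     # count of non-missing values
  r <- rank(x, na.last = "keep", ties.method = "average") # NA stays NA
  (r - 1) / (n - 1)                                       # smallest -> 0, largest -> 1
}

set.seed(1)
exdf <- data.frame(height = runif(5, min = 0, max = 400))
exdf$q <- quantile_rank(exdf$height)
exdf
```

For an odd number of distinct values, the median has rank (n + 1)/2 and therefore maps to exactly 0.5, which is the behaviour the question asks for.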
Henry answered Feb 23 '26 08:02



