I have a data set that list the percentiles for a set of scores like this:
> percentiles
Score Percentile
1 231 0
2 385 1
3 403 2
4 413 3
5 418 4
6 424 5
7 429 6
8 434 7
9 437 8
10 441 9
11 443 10
I would like the "Score" column to run from 100 to 500. That is, I would like Scores 100 to 231 to be associated with a Percentile of 0, Scores 232 to 385 to be associated with a Percentile of 1, etc. Is there a simple way to fill in the values that do not appear in the sequence of "Score" values so it looks like the below data set?
> percentiles
Score Percentile
1 100 0
2 101 0
3 102 0
4 103 0
5 104 0
6 105 0
7 106 0
8 107 0
9 108 0
10 109 0
--------------------
130 229 0
131 230 0
132 231 0
133 232 1
134 233 1
135 234 1
136 235 1
137 236 1
138 237 1
139 238 1
140 239 1
If you convert percentiles
to a data.table, you could do a rolling join with a new table of all scores 100:500. The rolling join with roll = -Inf
gives a fill-backward behavior by itself, but still the 444:500 values are NA so a forward nafill
is added at the end.
library(data.table)
setDT(percentiles)
percentiles[data.table(Score = 100:500), on = .(Score), roll = -Inf
][, Percentile := nafill(Percentile, 'locf')]
# Score Percentile
# 1: 100 0
# 2: 101 0
# 3: 102 0
# 4: 103 0
# 5: 104 0
# ---
# 397: 496 10
# 398: 497 10
# 399: 498 10
# 400: 499 10
# 401: 500 10
You might think about this differently: instead of a data frame to fill, as a set of breaks for binning your scores. Use the scores as the breaks with -Inf
tacked on to have the lower bound. If you need something different to happen for the scores above the highest break, add Inf
to the end of the breaks, but you'll need to come up with an additional label.
library(dplyr)
dat <- data.frame(Score = 100:500) %>%
mutate(Percentile = cut(Score, breaks = c(-Inf, percentiles$Score),
labels = percentiles$Percentile,
right = T, include.lowest = F))
Taking a look at a few of the breaking points:
slice(dat, c(129:135, 342:346))
#> Score Percentile
#> 1 228 0
#> 2 229 0
#> 3 230 0
#> 4 231 0
#> 5 232 1
#> 6 233 1
#> 7 234 1
#> 8 441 9
#> 9 442 10
#> 10 443 10
#> 11 444 <NA>
#> 12 445 <NA>
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With