quantile(X, probs = seq(0, 1, length.out = 5), type = 5)
How would you translate this into a data.table operation that uses := to add a new column, assigning each value an ordered bin number according to the quantile interval it falls in (for example 1 for values up to the 25% quantile, 2 for values up to the 50% quantile, and so on), computed separately for each ID?
You could use findInterval. This lets you keep using quantile, including any of its quantile definitions (the type argument).
For example:
findInterval(x, quantile(x,type=5), rightmost.closed=TRUE)
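To connect this back to the question, here is a minimal sketch of the same call used with := and by, so that the quartile breaks are computed within each ID. The column names ID, X and qbin and the simulated data are assumptions made for illustration; they are not from the original post.

```r
library(data.table)

set.seed(42)
# Hypothetical example data: three IDs with 100 values each
DT <- data.table(ID = rep(c("a", "b", "c"), each = 100), X = rnorm(300))

# Quartile breaks are computed separately within each ID.
# qbin is 1 for the lowest quarter of an ID's values and 4 for the highest;
# rightmost.closed = TRUE puts the group maximum in bin 4 rather than a bin 5.
DT[, qbin := findInterval(X,
                          quantile(X, probs = seq(0, 1, length.out = 5), type = 5),
                          rightmost.closed = TRUE),
   by = ID]

# Count the rows per ID and bin to check the assignment
DT[, .N, by = .(ID, qbin)][order(ID, qbin)]
```

Because the breaks are the 0%, 25%, 50%, 75% and 100% quantiles, this gives four bins per ID. Heavily tied data can produce duplicated breaks, in which case some bins may be empty.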
# It is fast
library(data.table)
library(microbenchmark)
set.seed(1)
DT <- data.table(x = rnorm(1e6))
microbenchmark(
  order = DT[order(x), bin := ceiling(.I / .N * 5)],
  findInterval = DT[, b2 := findInterval(x, quantile(x, type = 5), rightmost.closed = TRUE)],
  times = 10
)
## Unit: milliseconds
##          expr       min        lq    median       uq      max neval
##         order 551.31154 568.20324 573.36605 640.3255 655.5024    10
##  findInterval  70.16782  79.11459  80.36363 140.2807 147.3080    10