
nth lowest value from previous k values

I have the following data table (created with the data.table package).

Month   ER
196307  -0.39359311
196308  5.06729343
196309  -1.56222299
196310  2.53955005
196311  -0.85428909

For each row, I want to add a column VR that holds the second-highest value among the previous three values of ER. For example, for 196311 it would be 2.53955, and for 196310 it would be -0.39359311.

riskiem asked Dec 31 '22 19:12


2 Answers

In data.table you can use frollapply to perform a rolling calculation.

library(data.table)

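n <- 2  # rank to pick: 2 = second highest

# Roll a window of 3 over ER, then shift the result down one row so that
# each row sees only its three previous values (not the current one)
setDT(df)[, VR := shift(frollapply(ER, 3,
                    function(x) sort(x, decreasing = TRUE)[n], fill = NA))]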
df

#    Month     ER     VR
#1: 196307 -0.394     NA
#2: 196308  5.067     NA
#3: 196309 -1.562     NA
#4: 196310  2.540 -0.394
#5: 196311 -0.854  2.540
Ronak Shah answered Jan 18 '23 14:01


(@akrun beat me by a few seconds.) I'm not sure whether you have a literal data.table (i.e., from the data.table package); in any case, you can use rollapply (from zoo, for a regular data frame) or frollapply (if you are using data.table):

dd <- read.table(header=TRUE, text="
Month   ER
196307  -0.39359311
196308  5.06729343
196309  -1.56222299
196310  2.53955005
196311  -0.85428909
")

library(zoo)

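# width = 4 covers the three previous values plus the current one;
# x[-4] drops the current value before sorting
dd$VR <- rollapply(dd$ER, width=4, FUN= function(x) sort(x[-4])[2],
                   fill=NA, align="right")
dd$VR
## [1]         NA         NA         NA -0.3935931  2.5395500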

Since rollapply includes the current value, in order to get the second-highest of the previous 3 values, I take four values at a time and ignore the fourth (x[-4]); @akrun's solution uses lag() instead (which seems slightly better, although it does induce a dependency on the tidyverse, which you might not want).
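To see that the two ways of excluding the current value agree, here is a minimal base-R sketch with no package dependencies. It is only an illustration of the windowing logic, not a replacement for rollapply or frollapply; the names `er`, `drop_current`, `lagged` and `lag_first` are made up for this example.

```r
# ER values from the question
er <- c(-0.39359311, 5.06729343, -1.56222299, 2.53955005, -0.85428909)

# Approach 1: window of 4 ending at row i, then drop the current (4th) value
drop_current <- vapply(seq_along(er), function(i) {
  if (i < 4) return(NA_real_)   # fewer than 3 previous values available
  w <- er[(i - 3):i]            # 3 previous values plus the current one
  sort(w[-4])[2]                # middle (second-highest) of the 3 previous values
}, numeric(1))

# Approach 2: lag by one row first, then use a window of 3
lagged <- c(NA, head(er, -1))   # base-R stand-in for a one-step lag
lag_first <- vapply(seq_along(er), function(i) {
  if (i < 4) return(NA_real_)
  sort(lagged[(i - 2):i])[2]    # window now contains only previous values
}, numeric(1))

identical(drop_current, lag_first)
```

Both vectors are NA for the first three rows and hold the second-highest of the three preceding values afterwards.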

Also @akrun's solution sorts on -x (to reverse the order): in this case, since you specified the second-highest of three values, it doesn't matter which order you sort in, but your question's title ("nth lowest from previous k") does suggest that reverse-ordering would be a good idea.
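For the general case in the title, sort direction does matter once n is not the middle rank. Below is a rough base-R sketch, assuming a plain numeric vector with no missing values; the function name `nth_of_previous` and its arguments `k`, `n` and `lowest` are hypothetical and do not come from any package.

```r
# nth lowest (or nth highest) of the k values preceding each element.
# Hypothetical helper for illustration; assumes x contains no NAs.
nth_of_previous <- function(x, k, n, lowest = TRUE) {
  stopifnot(n >= 1, n <= k)
  vapply(seq_along(x), function(i) {
    if (i <= k) return(NA_real_)          # fewer than k previous values
    prev <- x[(i - k):(i - 1)]            # the k values before position i
    sort(prev, decreasing = !lowest)[n]   # rank n from the chosen end
  }, numeric(1))
}

er <- c(-0.39359311, 5.06729343, -1.56222299, 2.53955005, -0.85428909)

second_lowest <- nth_of_previous(er, k = 3, n = 2)                  # same as second highest when k = 3
highest       <- nth_of_previous(er, k = 3, n = 1, lowest = FALSE)  # direction matters here
lowest        <- nth_of_previous(er, k = 3, n = 1)
```

With k = 3 and n = 2 the result is the same in either direction, which is why the original example hides the difference; with n = 1 the two directions give the maximum and the minimum of the previous three values respectively.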

Ben Bolker answered Jan 18 '23 12:01