
Filtering Data Table Row Vector That Lies Between 2 Numeric Vectors

Tags: r

I have a data.table (df) that looks similar to this:

df <- read.table(header=TRUE, text='
ID AltID   Crit1   Crit2   Crit3
1  1       1       5       10
1  2       3       7       15
1  3       2       6       11')

and for each Crit-column I have an upper and lower bound like this:

minCutoff = c(0, 5, 10)
maxCutoff = c(4, 7, 12)

that are calculated from data.table (df).

I'd like a function which excludes any row where one value is out of bounds. In addition I'd like this function to work with a variable number of Crit columns (e.g. 3 Crit columns, 4 Crit columns, etc.) since my input data is subject to change.

So for this example, rows 1 and 3 would be kept but row 2 would be discarded since its Crit3 (15) > maxCutoff (12) despite Crit1 and Crit2 being within the acceptable ranges. The output would therefore be:

ID AltID   Crit1   Crit2   Crit3
1  1       1       5       10
1  3       2       6       11

I've tried solving this using a for loop to count the number of columns I have and then a nested for loop to iterate over the rows using something like...

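for (c in 1:(ncol(df)-2)+2)   # columns 3..ncol(df), i.e. the Crit columns
{
    for (r in 1:nrow(df)) 
    {
     # the cutoff vectors are indexed from 1, so undo the +2 column offset
     between(df[r,c], minCutoff[c-2], maxCutoff[c-2])
    }
}

*The 1:(ncol(df)-2)+2 is due to working around the two ID columns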

However, now I have a TON of T/F values that I'm having trouble aggregating to determine whether a row should be kept or discarded.

I'm sure there's a magical R way of making this process simpler, but I'm not skilled enough to see it.

If anyone has any tips, tricks, or other threads to point me in the right direction I'd be mighty grateful.

Ryan Eller asked Nov 24 '25 05:11


1 Answer

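You don't need an external package just to use between; base R can do what you want. Select the Crit columns by name, then keep each row whose values all lie within the corresponding bounds.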

minCutoff <- c(0, 5, 10)
maxCutoff <- c(4, 7, 12)

cols <- grep("^Crit", names(df))

inx <- apply(df[cols], 1, function(x) all(minCutoff <= x & x <= maxCutoff))
df[inx, ]
#  ID AltID Crit1 Crit2 Crit3
#1  1     1     1     5    10
#3  1     3     2     6    11
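
Since the question asks for a function that copes with a varying number of Crit columns, the same idea can be wrapped up as below. This is only a sketch: the name filter_by_bounds and its pattern argument are made up for illustration, and it assumes the cutoff vectors are given in the same order as the Crit columns.

```r
# Hypothetical helper (name and arguments are illustrative, not from the thread):
# keep the rows whose matching columns all lie within [lower, upper].
filter_by_bounds <- function(data, lower, upper, pattern = "^Crit") {
  cols <- grep(pattern, names(data))
  # Fail early if the number of cutoffs does not match the number of columns
  stopifnot(length(cols) == length(lower), length(cols) == length(upper))
  # Row-wise check, same logic as the apply() call above
  keep <- apply(data[cols], 1, function(x) all(lower <= x & x <= upper))
  data[keep, , drop = FALSE]
}

df <- read.table(header = TRUE, text = "
ID AltID Crit1 Crit2 Crit3
1  1     1     5     10
1  2     3     7     15
1  3     2     6     11")

result <- filter_by_bounds(df, lower = c(0, 5, 10), upper = c(4, 7, 12))
print(result)  # the rows with AltID 1 and 3 remain
```

Adding a fourth Crit column only requires passing cutoff vectors of length four; the length check stops with an error if they do not line up.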

DATA.

df <- read.table(text = "
ID AltID   Crit1   Crit2   Crit3
1  1       1       5       10
1  2       3       7       15
1  3       2       6       11
", header = TRUE)
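
On the "TON of T/F values" from the question: another way to aggregate them, still in base R, is to build one logical vector per column and AND them together. The following is a sketch of that alternative, not part of the answer above; the names in_range and keep are just illustrative.

```r
df <- read.table(header = TRUE, text = "
ID AltID Crit1 Crit2 Crit3
1  1     1     5     10
1  2     3     7     15
1  3     2     6     11")

minCutoff <- c(0, 5, 10)
maxCutoff <- c(4, 7, 12)

cols <- grep("^Crit", names(df))

# One logical vector per Crit column: is each value within that column's bounds?
in_range <- Map(function(x, lo, hi) x >= lo & x <= hi,
                df[cols], minCutoff, maxCutoff)

# Element-wise AND across the columns gives one TRUE/FALSE per row
keep <- Reduce(`&`, in_range)

kept <- df[keep, ]
print(kept)
```

This works column by column rather than row by row, so it does not convert the data frame to a matrix the way apply does. Note that an NA in any Crit column can make keep NA for that row, so missing values would need handling first.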
Rui Barradas answered Nov 25 '25 20:11


