Using conditional statements in R data.table

Tags:

r

data.table

I am trying to use data.table to recode a variable based on certain conditions. My original dataset has around 30M records and, once all variables have been created, around 130 variables. I tried the methods suggested in two other questions: "conditional statements in data.table" (M1) and "data.table: Proper way to do create a conditional variable when column names are not known?" (M2).

My goal is to get the equivalent of the code below, but written with data.table:

samp$lf5 <- samp$loadfactor5

samp$lf5 <- with(samp, ifelse(loadfactor5 < 0, 0, lf5))

I will admit that I don't understand .SD and .SDCols very well, so I might be using them incorrectly. The code and errors from (M1) and (M2) are given below, and the sample dataset is here: http://goo.gl/Jp97Wn

(M1)

samp[,lf5 = if(loadfactor5 <0) 0 else loadfactor5]

Error Message

Error in `[.data.table`(samp, , lf5 = if (loadfactor5 < 0) 0 else loadfactor5) : 
unused argument (lf5 = if (loadfactor5 < 0) 0 else loadfactor5)

When I do this:

samp[,list(lf5 = if(loadfactor5 <0) 0 else loadfactor5)]

it returns lf5 as a separate result rather than adding it to the samp data.table, and the condition is not actually applied: lf5 still contains values less than 0.

(M2)

Col1 <- "loadfactor5"
Col2 <- "lf5"

setkeyv(samp,Col1)
samp[,(Col2) :=.SD,.SDCols = Col1][Col1<0,(Col2) := .SD, .SDcols = 0]

I get the following error:

Error in `[.data.table`(samp, , `:=`((Col2), .SD), .SDCols = Col1) : 
unused argument (.SDCols = Col1)

Any insights on how to finish this would be appreciated. My dataset has 30M records, so I am hoping to use data.table to cut the run time down substantially.

Thanks,

Krishnan

Krishnan asked Aug 29 '14 15:08


1 Answer

Answer provided by eddi and included here for the sake of completeness.

samp[, lf5 := ifelse(loadfactor5 < 0, 0, loadfactor5)]
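For context, the (M1) attempt appears to fail for two reasons. Inside `[.data.table` a column is added with `:=` rather than `=`, and `if` tests only a single value whereas `ifelse` is vectorized over the whole column. The sketch below runs on a made-up four-row table (the values are an assumption, not the sample dataset). It also shows a few alternatives that may be worth timing on 30M rows: `pmax`, copy-then-overwrite, and `fifelse` (available in data.table 1.12.4 and later).

```r
library(data.table)

# Toy data standing in for the real table (values are made up)
samp <- data.table(loadfactor5 = c(-0.5, 0.2, -1.0, 0.8))

# Vectorized recode, added to samp by reference with :=
samp[, lf5 := ifelse(loadfactor5 < 0, 0, loadfactor5)]

# Alternative 1: pmax() clamps negative values to 0
samp[, lf5_pmax := pmax(loadfactor5, 0)]

# Alternative 2: copy the column, then overwrite only the matching rows
samp[, lf5_sub := loadfactor5]
samp[loadfactor5 < 0, lf5_sub := 0]

# Alternative 3: fifelse(), data.table's own ifelse (data.table >= 1.12.4);
# the yes/no values must have the same type, so loadfactor5 must be double here
samp[, lf5_fif := fifelse(loadfactor5 < 0, 0, loadfactor5)]

print(samp)
```

All four columns should hold the same values. Which variant is fastest depends on the data, so it is worth benchmarking on a subset first.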

Krishnan answered Sep 25 '22 14:09
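A note on the (M2) attempt, for the case where column names are held in variables. The "unused argument" error is consistent with the argument being spelled `.SDcols` (lowercase "c") rather than `.SDCols`. Separately, `Col1 < 0` compares the string "loadfactor5" with 0 rather than the column's values. One possible approach, sketched here on made-up toy data, is to look the column up with `get()`:

```r
library(data.table)

# Toy data standing in for the real table (values are made up)
samp <- data.table(loadfactor5 = c(-0.5, 0.2, -1.0, 0.8))

Col1 <- "loadfactor5"
Col2 <- "lf5"

# (Col2) on the left of := means "use the name stored in Col2";
# get(Col1) looks up the column whose name is stored in Col1
samp[, (Col2) := get(Col1)]

# Overwrite only the rows where the source column is negative
samp[get(Col1) < 0, (Col2) := 0]

print(samp)
```

No `setkeyv` call is needed for this recode; keying only changes the row order and is not required for the assignment to work.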