
Update one column twice in a data.table efficiently in R

I have a data table that looks like this:

DT <- data.table(Zeit = c(0.024, 0.4, 0.05),
                 Gier = c(1, 2, 3),
                 GierVZ = c(1, 0, 1),
                 Quer = c(2, 4, 6))

Now I want to update and add some columns to this data table. However, I cannot update Gier twice in the same call, because the duplicate assignment raises an error.

DT[, ':='(Zeit   = round(Zeit, digits = 2),
          Gier   = replace(Gier, Gier == 163.83, NA),
          GierVZ = factor(GierVZ, levels = c(0, 1), labels = c("positiv", "negativ")),
          Quer   = Quer * 9.81,
          Gier   = ifelse(GierVZ == "negativ", Gier * -1, Gier))]

How can I avoid this in general and still write readable, fast code? I am sure there is an easy answer. I am fairly new to data.table and, at least for now, I find it less intuitive than dplyr, but it is much faster on my large data.

Bolle asked Dec 22 '22 18:12


2 Answers

You could evaluate Gier in curly braces:

DT[, ':='(Zeit   = round(Zeit, digits = 2),
          Gier   = {Gier[Gier == 163.83] <- NA; ifelse(GierVZ, -Gier, Gier)},
          GierVZ = factor(GierVZ, levels = c(0, 1), labels = c("positiv", "negativ")),
          Quer   = Quer * 9.81)]
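
This works because every right-hand side in a single `:=` call is evaluated against the columns as they were before the call, so GierVZ is still the numeric 0/1 column when ifelse() sees it. A minimal sketch of the same pattern on a made-up table (the names `toy`, `a` and `flag` are hypothetical and not from the question):

```r
library(data.table)

# Hypothetical toy table, used only to illustrate the braces pattern
toy <- data.table(a = c(1, 2, 3), flag = c(1, 0, 1))

toy[, `:=`(
  # both updates of `a` happen inside one braced expression;
  # the value of the last statement becomes the new column
  a    = { a[a == 2] <- NA; ifelse(flag == 1, -a, a) },
  # flag is replaced in the same call, but the line above still saw 0/1
  flag = factor(flag, levels = c(0, 1), labels = c("pos", "neg"))
)]

print(toy)
```

The braces keep the two-step logic for one column in one place, at the cost of a slightly denser line.
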
user12728748 answered Jan 18 '23 19:01


This approach has roughly the same level of readability to me & accomplishes your goal:

DT[ , `:=`(
  Zeit = round(Zeit, digits=2L),
  GierVZ = factor(GierVZ, levels = c(0, 1), labels = c("positiv", "negativ")),
  Quer   = Quer * 9.81
)]
DT[Gier == 163.83, Gier := NA]
DT[ , Gier := fifelse(GierVZ == "negativ", Gier * -1, Gier)]
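
If you prefer a single statement, the same three steps can also be chained with `DT[...][...]`. This is only a sketch of that variant, using the question's DT; each link in the chain sees the result of the previous one, so the second update of Gier can compare against the factor labels:

```r
library(data.table)

DT <- data.table(Zeit = c(0.024, 0.4, 0.05),
                 Gier = c(1, 2, 3),
                 GierVZ = c(1, 0, 1),
                 Quer = c(2, 4, 6))

# Step 1: columns that are updated once
DT[, `:=`(
  Zeit   = round(Zeit, digits = 2L),
  GierVZ = factor(GierVZ, levels = c(0, 1), labels = c("positiv", "negativ")),
  Quer   = Quer * 9.81
)][
  # Step 2: first update of Gier (sentinel value becomes NA)
  Gier == 163.83, Gier := NA
][
  # Step 3: second update of Gier, GierVZ is already a factor here
  , Gier := fifelse(GierVZ == "negativ", -Gier, Gier)
]

print(DT)
```
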

Alternatively, in the development version of data.table (see its installation instructions), you could benefit from fcase:

DT[ , `:=`(
  Zeit   = round(Zeit, digits=2L),
  GierVZ = factor(GierVZ, levels = c(0, 1), labels = c("positiv", "negativ")),
  Quer   = Quer * 9.81,
  # all right-hand sides see the original columns, so GierVZ is still 0/1 here
  Gier   = fcase(
      Gier == 163.83, NA_real_,
    GierVZ == 1     ,    -Gier,   # 1 is labelled "negativ"
    GierVZ == 0     ,     Gier    # 0 is labelled "positiv"
  )
)]

It would be easier if you could skip writing out the last condition (the 'positiv' case); this is a feature request in progress.
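
To see why the last condition currently has to be spelled out, here is a small sketch on made-up vectors (`x` and `sign_flag` are hypothetical names): any element that matches none of the conditions falls through to fcase's default, which is NA.

```r
library(data.table)

# Hypothetical inputs: a sentinel value, one "negative" row, one "positive" row
x         <- c(163.83, 1, 2)
sign_flag <- c(0, 1, 0)

res <- fcase(
  x == 163.83   , NA_real_,
  sign_flag == 1, -x
  # no condition for sign_flag == 0, so those elements get the default (NA)
)

print(res)
```

The third element is lost here, which is why the answer's code lists the remaining case explicitly.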

MichaelChirico answered Jan 18 '23 21:01