R passing data.table parameters through function calls

Tags: r, data.table

Suppose I have a data.table defined as:

    > dt <- data.table(x=c(1,2,3,4), y=c("y","n","y","m"), z=c("pickle",3,8,"egg"))
    > dt
        x   y        z 
    1:  1   y   pickle
    2:  2   n        3
    3:  3   y        8
    4:  4   m      egg

And a variable

    fn <- "z"

I know I can pull a column from the data.table with the following:

    > dt[,fn, with=FALSE]

What I don't know how to do is the data.table equivalent of the following:

    > factorFunction <- function(df, fn) {
        df[,fn] <- as.factor(df[,fn])
        return(df)
      }

If I set fn="x" and call factorFunction(data.frame(dt),fn) it works just fine.

So I tried the same thing with a data.table, but this doesn't work:

    > factorFunction <- function(dt, fn) {
        dt[,fn, with=FALSE] <- as.factor(dt[,fn, with=FALSE])
        return(dt)
      }

    Error in sort.list(y) : 'x' must be atomic for 'sort.list'
    Have you called 'sort' on a list?

asked Jan 09 '23 03:01 by David Wagle


2 Answers

You can try:

    dt[, (fn) := factor(.SD[[1L]]), .SDcols=fn]

If there are multiple columns, use lapply(.SD, factor) on the right-hand side instead.
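The multi-column case could look like the following sketch. The `cols` vector is a hypothetical name chosen for illustration, and the table simply mirrors the one from the question.

```r
library(data.table)

dt <- data.table(x = c(1, 2, 3, 4),
                 y = c("y", "n", "y", "m"),
                 z = c("pickle", 3, 8, "egg"))

# Hypothetical vector naming the columns to convert
cols <- c("y", "z")

# lapply(.SD, factor) builds one factor per column listed in .SDcols;
# (cols) := assigns the results back to those columns by reference
dt[, (cols) := lapply(.SD, factor), .SDcols = cols]

sapply(dt, class)
#         x         y         z
# "numeric"  "factor"  "factor"
```

The parentheses around `cols` matter: without them, `:=` would create a column literally named `cols`.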

Wrapping it in a function:

    factorFunction <- function(df, fn) {
      # := updates the column in place and returns the data.table invisibly
      df[, (fn) := factor(.SD[[1L]]), .SDcols=fn]
    }

    str(factorFunction(dt, fn))
    #Classes ‘data.table’ and 'data.frame':    4 obs. of  3 variables:
    #$ x: num  1 2 3 4
    #$ y: chr  "y" "n" "y" "m"
    #$ z: Factor w/ 4 levels "3","8","egg",..: 4 1 2 3
answered Jan 10 '23 18:01 by akrun


Similar to @akrun's answer:

    class(dt[[fn]])
    #[1] "character"

    setFactor <- function(DT, col) {
      # change the column type by reference
      DT[, c(col) := factor(DT[[col]])]
      invisible(NULL)
    }

    setFactor(dt, fn)
    class(dt[[fn]])
    #[1] "factor"
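Because the update happens by reference, the caller's table changes even though the function returns NULL. If the original needs to stay untouched, one option is to work on a `copy()` first. A minimal sketch, where `dt_copy` is a hypothetical name and `setFactor` is repeated from this answer so the snippet runs on its own:

```r
library(data.table)

# Same helper as in the answer: modifies DT in place
setFactor <- function(DT, col) {
  DT[, c(col) := factor(DT[[col]])]
  invisible(NULL)
}

dt <- data.table(x = c(1, 2, 3, 4), z = c("pickle", 3, 8, "egg"))

# copy() makes a deep copy, so later := calls on dt do not affect it
dt_copy <- copy(dt)

setFactor(dt, "z")

class(dt[["z"]])
#[1] "factor"
class(dt_copy[["z"]])
#[1] "character"
```

A plain `dt_copy <- dt` would not be enough here, since both names would then point to the same table.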
answered Jan 10 '23 17:01 by Roland