How does data.table get the column name from j?

Tags: r, data.table

For example:

library(data.table)
dt <- data.table()
x <- 1:5
> dt[,list(2,3,x)]
   V1 V2 x
1:  2  3 1
2:  2  3 2
3:  2  3 3
4:  2  3 4
5:  2  3 5

The resulting data.table has a column named x, taken from the name of the variable passed to list.

I would like to write a function that simplifies data.table construction:

tt <- function(a, b, ...){
    list(a=sum(a), b=sum(b), ...)
}

> dt[,tt(1:2,1:3,x)]
   a b V3
1: 3 6  1
2: 3 6  2
3: 3 6  3
4: 3 6  4
5: 3 6  5

The idea is to call tt wherever I would otherwise call list, so that the predefined columns a and b are inserted automatically. However, tt does not pick up the name x for the unnamed argument; the column is called V3 instead.

How can I improve tt so that it names columns automatically, the way list does inside data.table?

Aim

dt[,tt(1:2,1:3,x)]

Returns

   a b  x
1: 3 6  1
2: 3 6  2
3: 3 6  3
4: 3 6  4
5: 3 6  5

Solution

tt <- function(a, b, ...){
    dots <- list(...)
    inferred <- sapply(substitute(list(...)), function(x) deparse(x)[1])[-1]
    if(is.null(names(inferred))){
        names(dots) <- inferred
    } else {
        names(dots)[names(inferred) == ""] <- inferred[names(inferred) == ""]
    }
    c(a=sum(a), b=sum(b), dots)
}

dt <- data.table(c=1:5)
x=1:5

> dt[,tt(1:2,1:3,x,c+1)]
   a b x c + 1
1: 3 6 1     2
2: 3 6 2     3
3: 3 6 3     4
4: 3 6 4     5
5: 3 6 5     6
> dt[,tt(1:2,1:3,x, z=c+1)]
   a b x z
1: 3 6 1 2
2: 3 6 2 3
3: 3 6 3 4
4: 3 6 4 5
5: 3 6 5 6
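
The name inference above rests on two base R functions: substitute() captures the arguments unevaluated, and deparse() turns each captured expression into text. The following is a minimal sketch of that mechanism on its own, without data.table. The helper name show_dots is hypothetical and is not part of data.table or of the solution above.

```r
# Hypothetical helper (illustration only): returns the source text of each
# expression passed through `...`.
show_dots <- function(...) {
    # substitute(list(...)) gives the unevaluated call list(<expr1>, <expr2>, ...);
    # as.list() splits it into its parts and [-1] drops the leading `list` symbol.
    exprs <- as.list(substitute(list(...)))[-1]
    # deparse() can return several strings for a long expression; keep the first.
    vapply(exprs, function(e) deparse(e)[1], character(1))
}

x <- 1:5
res <- show_dots(x, x + 1, z = x * 2)
res
# values: "x", "x + 1", "x * 2"; only the third element carries a name ("z")
```

Unnamed arguments end up with an empty name, which is why the solution only fills in names where names(inferred) == "".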

Update

I recently found a bug in the code on page 46 of S Programming by Venables & Ripley. I have modified it and am posting it here in the hope that it is useful to others.

# Build the best vector of names for the arguments, the way data.frame does.
# Modified from page 46 of S Programming by Venables & Ripley.
# http://stackoverflow.com/questions/20545476/how-does-data-table-get-the-column-name-from-j
name.args <- function(...){
    # Capture the arguments unevaluated, as a list of expressions.
    dots <- as.list(substitute(list(...)))[-1]
    # Names of the arguments; "" for unnamed ones.
    # If no argument is named at all, names() returns NULL.
    nm <- names(dots)
    # If all arguments are named, return the names directly.
    # Otherwise nothing would be selected below and the assignment
    # nm[fixup] <- list() would cause a problem.
    if (!is.null(nm) && all(nm != ""))
        return(nm)
    # Handle empty argument list case.
    if (length(dots) == 0)
        return(character(0))
    # Get positions of arguments without names.
    fixup <- 
        if (is.null(nm))
            seq_along(dots)
        else
            nm == ""
    dep <- sapply(dots[fixup], function(x) deparse(x)[1])
    if (is.null(nm))
        dep
    else {
        nm[fixup] <- dep
        nm
    }
}

# Example
# x <- 1:2
# name.args(x, y=3, 5:6)
# name.args(x=x, y=3)
# name.args()
colinfang asked Dec 12 '13 13:12


1 Answer

A simple solution would be to pass in additional arguments as named rather than unnamed arguments:

dt[,tt(1:2,1:3,x=x)]   ## Note that this uses `x=x` rather than just `x`
#    a b x
# 1: 3 6 1
# 2: 3 6 2
# 3: 3 6 3
# 4: 3 6 4
# 5: 3 6 5

Or for the truly lazy, something like this ;)

tt <- function(a, b, ...){
    dots <- list(...)
    names(dots) <- as.character(substitute(list(...))[-1])
    c(a=sum(a), b=sum(b), dots)
}
dt[,tt(1:2,1:3,x)]
#    a b x
# 1: 3 6 1
# 2: 3 6 2
# 3: 3 6 3
# 4: 3 6 4
# 5: 3 6 5
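
One caveat with this shortcut: it assigns a name to every element of dots, so an explicit name such as z = ... is overwritten by the text of the expression. The sketch below shows this in base R, without data.table. The function is a copy of the version above under the hypothetical name tt_lazy, used here only for illustration.

```r
# Copy of the shortcut above under a hypothetical name (illustration only).
tt_lazy <- function(a, b, ...) {
    dots <- list(...)
    # as.character() on the captured call returns the text of every argument,
    # whether or not the caller supplied a name for it.
    names(dots) <- as.character(substitute(list(...))[-1])
    c(a = sum(a), b = sum(b), dots)
}

x <- 1:5
names(tt_lazy(1:2, 1:3, x, z = x + 1))
# the explicit name `z` is replaced by the expression text "x + 1"
```

If explicit names need to survive, only the unnamed positions should be filled in, as the asker's own solution does.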
Josh O'Brien answered Sep 21 '22 05:09