
R data.table compute new column, but insert at beginning

Tags: r, data.table

In R data.tables, I can use this syntax to add a new column:

> dt <- data.table(a=c(1,2), b=c(3,4))
> dt[, c := a + b]
> dt
   a b c
1: 1 3 4
2: 2 4 6

But how would I insert c at the front of the dt like so:

   c a b
1: 4 1 3
2: 6 2 4

I looked on SO and found some people suggesting cbind for data.frames, but it's more convenient for me to use the := syntax here, so I was wondering whether there is a data.table-sanctioned way of doing this. My data.table has around 100 columns, so I don't want to list them all out.

user3685285 asked Feb 12 '18 20:02




1 Answer

Update: This feature has now been merged into the latest CRAN version of data.table (starting with v1.11.0), so installing the development version is no longer necessary to use this feature. From the release notes:

  1. setcolorder() now accepts less than ncol(DT) columns to be moved to the front, #592. Thanks @MichaelChirico for the PR.

The current development version of data.table (v1.10.5) updates setcolorder() to accept a partial list of columns, which makes this much more convenient. The columns provided are placed first, and all unspecified columns follow in their existing order.

Installation instructions for development branch here.

Note regarding development branch stability: I've been running it for several months to use the multi-threaded fread() in v1.10.5 (that alone is worth the update if you deal with multi-GB .csv files), and I have not noticed any bugs or regressions in my usage.

library(data.table)
DT <- as.data.table(mtcars)
DT[1:5]

gives

    mpg cyl disp  hp drat    wt  qsec vs am gear carb
1: 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
2: 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
3: 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
4: 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
5: 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2

Re-order columns based on a partial list:

setcolorder(DT,c("gear","carb"))
DT[1:5]

now gives

   gear carb  mpg cyl disp  hp drat    wt  qsec vs am
1:    4    4 21.0   6  160 110 3.90 2.620 16.46  0  1
2:    4    4 21.0   6  160 110 3.90 2.875 17.02  0  1
3:    4    1 22.8   4  108  93 3.85 2.320 18.61  1  1
4:    3    1 21.4   6  258 110 3.08 3.215 19.44  1  0
5:    3    2 18.7   8  360 175 3.15 3.440 17.02  0  0

If for any reason you don't want to update to the development branch, the following spells out the full column order and works in previous (and current CRAN) versions:

newCols <- c("gear","carb")
setcolorder(DT, c(newCols, setdiff(colnames(DT), newCols))) ## (Per Frank's advice in comments)

## the long way I'd always done before seeing setdiff()
## setcolorder(DT,c(newCols,colnames(DT)[which(!colnames(DT) %in% newCols)]))
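
Applied to the example from the question, a minimal sketch might look like the following. It assumes data.table v1.11.0 or later for the partial column list; the commented-out line is the full-order form for older versions.

```r
library(data.table)

dt <- data.table(a = c(1, 2), b = c(3, 4))

# := adds the new column by reference; it is appended as the last column
dt[, c := a + b]

# Move "c" to the front, also by reference (partial list, v1.11.0+)
setcolorder(dt, "c")

# Equivalent for older versions: spell out the full column order
# setcolorder(dt, c("c", setdiff(names(dt), "c")))

names(dt)
# [1] "c" "a" "b"
```

Both steps modify dt in place, so none of the other columns (around 100 in the asker's case) need to be listed.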

Matt Summersgill answered Oct 29 '22 15:10