I would like to merge two data.table objects by reference without having to write down every variable I want to merge. Here is a simple example to illustrate my needs:
library(data.table)
set.seed(20170711)
(a <- data.table(v_key=seq(1, 5), key="v_key"))
# v_key
#1: 1
#2: 2
#3: 3
#4: 4
#5: 5
a_backup <- copy(a)
(b <- data.table(v_key=seq(1, 5), v1=runif(5), v2=runif(5), v3=runif(5), key="v_key"))
# v_key v1 v2 v3
#1: 1 0.141804303 0.1311052 0.354798849
#2: 2 0.425955903 0.3635612 0.950234261
#3: 3 0.001070379 0.4615936 0.359660693
#4: 4 0.453054854 0.5768500 0.008470552
#5: 5 0.951767837 0.1649903 0.565894298
I want to copy every column of b into a by reference without specifying the column names.
I could do the following, but that would copy the object for no reason, which slows my program down and increases the RAM it needs:
(a <- a[b])
# v_key v1 v2 v3
#1: 1 0.141804303 0.1311052 0.354798849
#2: 2 0.425955903 0.3635612 0.950234261
#3: 3 0.001070379 0.4615936 0.359660693
#4: 4 0.453054854 0.5768500 0.008470552
#5: 5 0.951767837 0.1649903 0.565894298
Another option (without the useless copy) would be to specify the name of every column of b, resulting in the following:
a <- copy(a_backup)
a[b, `:=`(v1=v1, v2=v2, v3=v3)][]
# v_key v1 v2 v3
#1: 1 0.141804303 0.1311052 0.354798849
#2: 2 0.425955903 0.3635612 0.950234261
#3: 3 0.001070379 0.4615936 0.359660693
#4: 4 0.453054854 0.5768500 0.008470552
#5: 5 0.951767837 0.1649903 0.565894298
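The difference between the two approaches can be checked with data.table's address() function, which reports where an object lives in memory. This is only a minimal sketch: the tables are recreated so the block runs on its own, and the names a_join, a_update, addr_join, and addr_update are made up for the illustration.

```r
library(data.table)

# Toy tables mirroring the example above (values are arbitrary)
a <- data.table(v_key = 1:5, key = "v_key")
b <- data.table(v_key = 1:5, v1 = runif(5), v2 = runif(5), v3 = runif(5),
                key = "v_key")

# Two independent copies, one per approach (hypothetical names)
a_join   <- copy(a)
a_update <- copy(a)

addr_join   <- address(a_join)
addr_update <- address(a_update)

# Ordinary join: builds a new data.table and rebinds the name to it
a_join <- a_join[b]
same_after_join <- identical(address(a_join), addr_join)

# Update join with `:=`: adds the columns to the existing object
a_update[b, `:=`(v1 = v1, v2 = v2, v3 = v3)]
same_after_update <- identical(address(a_update), addr_update)

print(c(join = same_after_join, update = same_after_update))
```

The address should change for the ordinary join and stay the same for the update join, which is the behaviour I am trying to keep.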
In brief, I would like the efficiency of my second example (no useless copy) without having to specify every column name in b.
I guess I could find a way of doing it using a combination of the colnames() and get() functions, but I am wondering whether there is a cleaner way to do it, since clean syntax is important to me.
As you wrote, a combination of colnames and mget could get you there.
Consider this:
# retrieve the column names from b - without the key ('v_key')
thecols = setdiff(colnames(b), key(b))
# assign them to a
a[b, (thecols) := mget(thecols)]
This is not too bad-looking, is it?
Besides, I don't think another syntax is currently implemented in data.table. But I would be happy to be proven wrong :)
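For completeness, here is a self-contained sketch of the same idea with one variation: inside a join, data.table lets you refer to the columns of the table in i with the i. prefix, so building the names with paste0("i.", thecols) makes it explicit that the values come from b. That may be safer if a already has columns with the same names. The toy data below is an assumption for illustration, not the data from the question.

```r
library(data.table)

# Toy tables shaped like those in the question (values are arbitrary)
a <- data.table(v_key = 1:5, key = "v_key")
b <- data.table(v_key = 1:5, v1 = runif(5), v2 = runif(5), v3 = runif(5),
                key = "v_key")

# Every column of b except its key
thecols <- setdiff(colnames(b), key(b))

# Update join: add those columns to a by reference, taking the values
# explicitly from b via the "i." prefix
a[b, (thecols) := mget(paste0("i.", thecols))]

print(a)
```

With this toy data, a should end up with the same columns and values as b, and no column name is ever typed by hand.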