Joining the data tables:
library(data.table)
X <- data.table(A = 1:4, B = c(1, 1, 1, 1))
# A B
# 1: 1 1
# 2: 2 1
# 3: 3 1
# 4: 4 1
Y <- data.table(A = 4)
# A
# 1: 4
via
X[Y, on = .(A == A)]
# A B
# 1: 4 1
returns the expected result. However, I would expect the line:
X[Y, on = .(A < A)]
# A B
# 1: 4 1
# 2: 4 1
# 3: 4 1
to return
A B
1: 1 1
2: 2 1
3: 3 1
because, according to ?data.table, the on argument is meant to:
"Indicate which columns in x should be joined with which columns in i along with the type of binary operator to join with"
The way the joining is done is not explicitly mentioned, and it is certainly not as I guessed. How exactly does < join columns in x with columns in i?
When doing a non-equi join like X[Y, on = .(A < A)], data.table returns the A column from Y (the i data.table).
To get the desired result, you could do:
X[Y, on = .(A < A), .(A = x.A, B)]
which gives:
A B
1: 1 1
2: 2 1
3: 3 1
In the next release, data.table will return both A columns. See here for the discussion.
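Until then, both join columns can be requested explicitly in j. The following is a minimal sketch, assuming the x. and i. column prefixes described in ?data.table; the output column names x.A and i.A are chosen here purely for illustration:

```r
library(data.table)

X <- data.table(A = 1:4, B = c(1, 1, 1, 1))
Y <- data.table(A = 4)

# x.A refers to column A of X (the x table),
# i.A refers to column A of Y (the i table).
res <- X[Y, on = .(A < A), .(x.A = x.A, i.A = i.A, B)]
print(res)
```

Keeping the two columns side by side makes it easier to see which value of X matched which value of Y.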
You're partially correct. The missing piece of the puzzle is that (currently) when you perform any join, including a non-equi join with <, a single column is returned for the join column (A in your example). This column takes its values from the data.table on the right side of the join, in this case the values in A from Y.
Here's an illustrated example:
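The same point can be sketched in code, assuming the X and Y from the question. The join is compared with a plain filter on X that selects the same rows; the names joined, matched and comparison are only illustrative:

```r
library(data.table)

X <- data.table(A = 1:4, B = c(1, 1, 1, 1))
Y <- data.table(A = 4)

# Non-equi join: for the single row of Y, find the rows of X with X$A < Y$A.
joined <- X[Y, on = .(A < A)]

# The same row selection written as an ordinary filter on X.
matched <- X[A < Y$A[1]]

# Same rows and same B, but the join reports Y's value in column A.
comparison <- data.table(
  A_in_join_output = joined$A,
  A_in_X           = matched$A,
  B                = joined$B
)
print(comparison)
```

The rows selected are the ones the question expected; only the values shown in the join column come from Y rather than from X.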
We're planning to change this behaviour in a future version of data.table so that both columns will be returned in the case of non-equi joins. See pull requests https://github.com/Rdatatable/data.table/pull/2706 and https://github.com/Rdatatable/data.table/pull/3093.