I would like to subset data based on a match in one column and no match in another, using data.table's J() and !J() functions.
library(data.table)
DT <- data.table(x = rep(c("a", "b", "c"), each=2000), y=c(rep(c(1,3,6), each = 1)) , key = c("x", "y"))
I am looking to have the J() and !J() functions provide the same result as the code below:
DT[J("b")][y !=1]
I tried the following, which produced this error:
DT[J("b")][!J(x, 1)]
Error in vecseq(f__, len__, if (allow.cartesian) NULL else as.integer(max(nrow(x), :
Join results in 1920000 rows; more than 4800 = max(nrow(x),nrow(i)). Check for duplicate key values in i, each of which join to the same group in x over and over again. If that's ok, try including `j` and dropping `by` (by-without-by) so that j runs for each group to avoid the large allocation. If you are sure you wish to proceed, rerun with allow.cartesian=TRUE. Otherwise, please search for this error message in the FAQ, Wiki, Stack Overflow and datatable-help for advice.
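A likely reason for this error is that x inside J(x, 1) refers to the whole column x of the subset, so the lookup table has one identical row per row of the data, and each of those rows matches every row where y is 1. The sketch below illustrates this with the data as defined in the question; the names b_rows, i_dup and n_match are made up for illustration, and the exact counts depend on the data used.

```r
library(data.table)

# Same shape as the question's data, with the recycling of y written out explicitly
DT <- data.table(x = rep(c("a", "b", "c"), each = 2000),
                 y = rep(c(1, 3, 6), times = 2000),
                 key = c("x", "y"))

b_rows <- DT[J("b")]

# Equivalent of J(x, 1) when x is the full column: one lookup row per data row
i_dup <- data.table(x = b_rows$x, y = 1)

nrow(i_dup)          # as many lookup rows as rows in b_rows
nrow(unique(i_dup))  # but only one distinct (x, y) combination

# Each duplicated lookup row matches every row of b_rows where y is 1,
# so the join would try to allocate this many rows
n_match <- b_rows[y == 1, .N]
nrow(i_dup) * n_match
```

Deduplicating the lookup values removes the blow-up, because a single lookup row can match each data row at most once.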
I also tried the code below, but it did not apply the second condition, which is to exclude rows where y is 1:
DT[J("b")][!J("1")]
This answer came from Arun; all credit goes to him.
library(data.table)
DT <- data.table(x = rep(c("a", "b", "c"), each=2000), y=c(rep(c(1,3,6), each = 1)) , key = c("x", "y"))
DT["b"][!J(unique(x), 1)]
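This subsets the data to all rows containing b in column x, then drops every row where column y is 1. Wrapping x in unique() keeps the lookup inside !J() to a single row, which avoids the cartesian join error shown in the question.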
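As a quick check, the sketch below compares the not-join against the filter from the question. It is a minimal sketch, assuming the data as defined above; the names expected, b_rows, result and same_rows are made up for illustration, and the explicit setkey() call is a precaution added here (not part of Arun's answer) so that the example does not depend on the subset retaining its key.

```r
library(data.table)

DT <- data.table(x = rep(c("a", "b", "c"), each = 2000),
                 y = rep(c(1, 3, 6), times = 2000),
                 key = c("x", "y"))

# Reference result from the question: match on x, then filter y
expected <- DT[J("b")][y != 1]

# Join on x, then not-join on (x, y) with a single lookup row
b_rows <- DT["b"]
setkey(b_rows, x, y)  # precaution: make sure both key columns are set
result <- b_rows[!J(unique(x), 1)]

# Both approaches should return the same rows in the same order
same_rows <- identical(expected$x, result$x) &&
  identical(expected$y, result$y)
same_rows
```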