
Understanding .I in data.table in R


Tags:

r

data.table

subset

I was playing around with data.table and I came across a distinction that I'm not sure I quite understand. Given the following dataset:

library(data.table)

set.seed(400)
DT <- data.table(x = sample(LETTERS[1:5], 20, TRUE), key = "x"); DT

Can you please explain to me the difference between the following expressions?

1) DT[J("E"), .I]

2) DT[ , .I[x == "E"] ]

3) DT[x == "E", .I]

615

asked May 10 '14 20:05

black_sheep07

1 Answer

set.seed(400)
library(data.table)

DT <- data.table(x = sample(LETTERS[1:5], 20, TRUE), key = "x"); DT

DT[  , .I[x == "E"] ] # [1] 18 19 20

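returns an integer vector (not a data.table). There is no subset in i, so .I holds the row number of every row of DT, and .I[x == "E"] picks out the row numbers of the E rows in the ORIGINAL dataset DT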
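To make the point above reproducible without relying on the random draw, here is a minimal sketch on a small table with fixed values. The table `DT2` and the names `rows_e` and `first_rows` are made up for illustration and are not part of the original question.

```r
library(data.table)

# Hypothetical 7-row table with fixed values; key = "x" keeps it sorted by x.
DT2 <- data.table(x = c("A", "A", "B", "C", "E", "E", "E"), key = "x")

# No subset in i, so .I spans every row of DT2 (1..7) and the logical
# filter inside j picks the positions of the "E" rows.
rows_e <- DT2[, .I[x == "E"]]
print(rows_e)               # 5 6 7

# The same positions computed with base R, for comparison.
print(which(DT2$x == "E"))  # 5 6 7

# With by, .I holds each group's row locations in DT2;
# here only the first location per group is kept.
first_rows <- DT2[, .(first_row = .I[1L]), by = x]
print(first_rows)
```

Filtering inside j like this is the usual way to get original row numbers, because .I is only indexed after it has been built over the whole table.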

DT[J("E")  , .I]   # [1] 1 2 3

DT["E"     , .I]   # [1] 1 2 3

DT[x == "E", .I]   # [1] 1 2 3

all return the same vector. The subset in i is applied first, so .I in j holds the row numbers of the E rows within the NEW subsetted data (1 to 3 here), not their positions in DT
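If the goal is to subset in i and still recover positions in the original table, one option is the `which = TRUE` argument of `[.data.table`. The sketch below uses a hypothetical fixed-value table `DT3` (not from the question), and the names `n_e` and `orig_rows` are illustrative. The output of the last line is deliberately not annotated, since it is the behavior described in the answer and is worth checking on your installed data.table version.

```r
library(data.table)

# Hypothetical table with fixed values, sorted by the key x.
DT3 <- data.table(x = c("A", "A", "B", "C", "E", "E", "E"), key = "x")

# With a subset in i, j only sees the matching rows, so .N is 3 here.
n_e <- DT3[x == "E", .N]
print(n_e)        # 3

# which = TRUE skips j and returns the row numbers of DT3 that i matches.
orig_rows <- DT3[x == "E", which = TRUE]
print(orig_rows)  # 5 6 7

# The form discussed in the answer: .I evaluated after the subset in i.
# Compare this result with orig_rows on your own installation.
print(DT3[x == "E", .I])
```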

114

answered Oct 11 '22 05:10

Ragy Isaac
