I am a new user of the R data.table package, and I have noticed something unusual in my data.tables that I have not found explained in the documentation or elsewhere on this site.
When using the data.table package within RStudio and viewing a specific data.table in the 'Environment' panel, I see the following string at the end of the data.table:
attr(*, ".internal.selfref")=<externalptr>
If I print the same data.table within the Console, this string does not appear.
Is this a bug, or just an inherent feature of data.table (or RStudio)? Should I be concerned about whether this is affecting how these data are handled by downstream processes?
The versions I am running are as follows:
data.table Version 1.9.6
RStudio Version 0.99.447
OS X 10.10.5
Apologies in advance if this is just me being an ignorant newbie.
I actually asked Matt Dowle, the primary author of the data.table package, this very question a little while ago.
Is this a bug, or just an inherent feature of data.table (or RStudio)?
Apparently this attribute is used internally by data.table. It isn't a bug in RStudio; in fact, RStudio is doing its job of showing the attributes of the object.
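To see the same thing outside the Environment panel, here is a minimal sketch. The table DT and its columns are made up for illustration, and I am assuming that str() gives output comparable to what the panel shows. print() displays only the data, whereas str() and attributes() also list the object's attributes:

```r
library(data.table)

# Hypothetical example table, for illustration only
DT <- data.table(x = c(3L, 1L, 2L), y = c("c", "a", "b"))

print(DT)  # shows the rows and columns only, no attribute line
str(DT)    # the structure listing also includes the attributes

# The attribute is an ordinary (if internal) attribute of the object
has_selfref  <- ".internal.selfref" %in% names(attributes(DT))
selfref_type <- typeof(attr(DT, ".internal.selfref"))
has_selfref
selfref_type
```

So the string is not part of your data; it is metadata attached to the object.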
Should I be concerned about whether this is affecting how these data are handled by downstream processes?
No, this isn't going to affect anything.
For those who are curious about why this attribute is created, I believe it's explained in the data.table manual under the section for setkey():
In v1.7.8, the key<- syntax was deprecated. The <- method copies the whole table and we know of no way to avoid that copy without a change in R itself. Please use the set* functions instead, which make no copy at all. setkey accepts unquoted column names for convenience, whilst setkeyv accepts one vector of column names. The problem (for data.table) with the copy by key<- (other than being slower) is that R doesn’t maintain the over allocated truelength, but it looks as though it has. Adding a column by reference using := after a key<- was therefore a memory overwrite and eventually a segfault; the over allocated memory wasn’t really there after key<-’s copy. data.tables now have an attribute .internal.selfref to catch and warn about such copies. This attribute has been implemented in a way that is friendly with identical() and object.size(). For the same reason, please use the other set* functions which modify objects by reference, rather than using the <- operator which results in copying the entire object.
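As a rough illustration of the set* advice in that passage, the sketch below uses setkey() and := on a small made-up table (DT and its columns are hypothetical). It uses data.table's address() to compare the object's memory address before and after; for a small table like this I would expect the address to stay the same, but treat that as an assumption rather than a guarantee:

```r
library(data.table)

# Hypothetical example table, for illustration only
DT <- data.table(x = c(3L, 1L, 2L), y = c("c", "a", "b"))

addr_before <- address(DT)

setkey(DT, x)      # sorts by x and sets the key, modifying DT by reference
DT[, z := x * 2L]  # adds column z by reference, no <- assignment needed

addr_after <- address(DT)

key(DT)                                 # the key column(s) now set on DT
DT                                      # rows are ordered by x and z is present
same_object <- identical(addr_before, addr_after)
same_object                             # whether DT still lives at the same address
```

Neither call assigns the result back with <-, which is the point of the set* functions and := described above.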