Someone recently shared their data table with me via dput()
and an error popped up that I had not previously encountered:
Error: unexpected '<' in: " class = c("data.table", "data.frame"), .internal.selfref = <"
After some digging I found this is specific to data.tables and, as advised in these answers, removing the .internal.selfref = <pointer: 0x7fd60e036ce0> attribute from the pasted text
did the trick, and I could assign their data successfully.
However, I anticipate sharing these types of data frequently between novice users, and I have not found a reasonable/sustainable way to prevent this from being exported by dput
, only ad-hoc functions and/or removing it by hand after receiving the output.
If I remove showAttributes
from the control = c("keepNA", "keepInteger", "niceNames", "showAttributes")
in dput
the .internal.selfref
is gone but so is everything else about the structure.
The questions and answers provided in the above linked questions were 5-9 years old; I was hoping that some improved functionality may be available (that I am obviously unaware of) that would tell dput
to ignore this, or perhaps if there is something I can do to the data table itself before dput
that would remove the .internal.selfref
altogether.
Is there a way to provide the dput
of a data.table
object without producing the .internal.selfref
?
Thanks in advance.
Example of the issue:
dattab <- data.table::data.table(a = 1:5, b = 6:10)
dput(dattab)
structure(list(a = 1:5, b = 6:10), row.names = c(NA, -5L),
class = c("data.table", "data.frame"),
.internal.selfref = <pointer: 0x7fd60e036ce0>)
Convert the data.table to a data.frame and then dput
it.
dput(as.data.frame(dattab))
## structure(list(a = 1:5, b = 6:10), row.names = c(NA, -5L), class = "data.frame")
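Because as.data.frame() drops the data.table class, the recipient has to restore it. A minimal sketch of that round trip, assuming the data.table package is installed on both ends; the names txt and restored are made up for illustration:

```r
library(data.table)

dattab <- data.table(a = 1:5, b = 6:10)

# Sender: deparse the plain data.frame version; no .internal.selfref appears
txt <- deparse1(as.data.frame(dattab))

# Recipient: parse the shared text, then convert back to a data.table by reference
restored <- eval(parse(text = txt))
setDT(restored)

is.data.table(restored)
```

setDT() converts in place, so the recipient does not need to reassign the result; data.table::as.data.table(restored) would work as well if a copy is preferred.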
Sometimes I think it's useful to share it with the appropriate class, so I use this function:
#' dput into a single line
#'
#' @param x object
#' @param assign logical, whether to get the 'x' object's name and
#' prepend it to the string
#' @param DT logical, whether to remove '.internal.selfref' from the
#' dput output (and wrap the result with 'data.table::as.data.table')
#' @return character length 1, invisibly
#' @export
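r2dput <- function(x, assign = TRUE, DT = TRUE) {
  # deparse1() (R >= 4.0.0) returns the deparsed object as a single string
  out <- deparse1(x)
  if (isTRUE(DT)) {
    # drop the unparseable ".internal.selfref = <pointer: 0x...>" attribute
    out2 <- gsub(",\\s*\\.internal\\.selfref\\s*=\\s*<pointer:\\s*[0-9a-fx]+>", "", out)
    if (out != out2) {
      # something was removed, so 'x' was a data.table; make that explicit
      out <- paste0("data.table::as.data.table(", out2, ")")
    }
  }
  if (assign) {
    # prepend "<name> <-" using the name of the object passed as 'x'
    out <- paste(c(as.character(substitute(x)), "<-", out), collapse = " ")
  }
  # copy the result to the system clipboard
  if (R.version$os == "mingw32") {
    writeLines(out, con = "clipboard")
  } else {
    clipr::write_clip(out)
  }
  invisible(out)
}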
As an example,
MT <- data.table::as.data.table(mtcars[1:3, 11:4])
MT
#     carb  gear    am    vs  qsec    wt  drat    hp
#    <num> <num> <num> <num> <num> <num> <num> <num>
# 1:     4     4     1     0 16.46 2.620  3.90   110
# 2:     4     4     1     0 17.02 2.875  3.90   110
# 3:     1     4     1     1 18.61 2.320  3.85    93
When I want to share that data in a question on SO (or with colleagues), I type r2dput(MT)
(because, you know, I'm r2evans ;-), then paste (Ctrl-V
) into the comment/question/answer and get this:
MT <- data.table::as.data.table(structure(list(carb = c(4, 4, 1), gear = c(4, 4, 4), am = c(1, 1, 1), vs = c(0, 0, 1), qsec = c(16.46, 17.02, 18.61), wt = c(2.62, 2.875, 2.32), drat = c(3.9, 3.9, 3.85), hp = c(110, 110, 93)), row.names = c(NA, -3L), class = c("data.table", "data.frame")))
This removes the pointer
, which cannot be parsed back in; wraps the result in as.data.table(.)
, making it clear what class I'm really working with; and keeps it all on one line (because I dislike the line breaks that the default dput
method produces in formatted code blocks).
(Edited to use clipr
on non-windows. I'm not sure it works the same on macos.)
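The pattern-stripping step can also be tried on its own, for instance when someone has already pasted a dput that will not parse. The sketch below uses only base R and applies the same regular expression as r2dput to the text from the question; the names shared, pat, cleaned and obj are illustrative, and the pattern assumes the pointer is printed as lowercase hex, as in the example above.

```r
# dput() text as received, including the attribute that cannot be parsed
shared <- 'structure(list(a = 1:5, b = 6:10), row.names = c(NA, -5L), class = c("data.table", "data.frame"), .internal.selfref = <pointer: 0x7fd60e036ce0>)'

# parsing the text as-is fails because of the "<pointer: ...>" token
parse_fails <- inherits(try(parse(text = shared), silent = TRUE), "try-error")

# same pattern as in r2dput(): remove ", .internal.selfref = <pointer: 0x...>"
pat <- ",\\s*\\.internal\\.selfref\\s*=\\s*<pointer:\\s*[0-9a-fx]+>"
cleaned <- gsub(pat, "", shared)

# the cleaned text parses, and the class attribute is preserved
obj <- eval(parse(text = cleaned))
class(obj)
# [1] "data.table" "data.frame"
```

The object rebuilt this way carries the data.table class but has no self-reference, so passing it through data.table::as.data.table() or data.table::setDT() before further use (as r2dput's output does) is the safer choice.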