
R data.table: how to go from tibble to data.table to tibble back?

I mainly use tables in the tibble format from the tidyverse, but for some steps I use the data.table package. What is the best way to convert a data.table back to a tibble?

I understand that data.table has the functions setDT() and setDF(), which convert a data.frame to a data.table (and vice versa) by reference, i.e. without making a copy.

But what if I want to convert back to a tibble? Am I copying the data when I call as_tibble() on the data.frame resulting from setDF()? Is there a clever way to do this, maybe using setattr() from data.table?

library(data.table)
library(tidyverse)

iris_tib <- as_tibble(iris)

## some data.table operation
setDT(iris_tib)
setkey(iris_tib, Species)
iris_tib[, Sepal.Length.Mean := mean(Sepal.Length), by = Species]



## How to convert back to tibble efficiently?
setDF(iris_tib)
iris_tib_back <- as_tibble(iris_tib)

## it looks like we were able to update by reference? Only rownames were (shallow) copied?
changes(iris_tib, iris_tib_back)
Matifou asked Sep 20 '18 18:09


1 Answer

As @Frank mentioned, this was discussed in a post here. One possibility is to use the setattr() function, which sets attributes by reference. Specifically:

setattr(x, "class", c("tbl_df", "tbl", "data.frame"))
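
A minimal sketch of how one might check that this call really avoids a copy, assuming data.table and tibble are installed. The names `df`, `tbl_class` and `addr_before` are illustrative and not from the original post. The class vector is taken from an empty tibble rather than typed by hand, and the memory address of one column is compared before and after the class change.

```r
library(data.table)
library(tibble)

# Illustrative data; any data.frame would do
df <- data.frame(x = 1:3, y = c("a", "b", "c"))

# Class vector of a real tibble, so the order does not have to be remembered
tbl_class <- class(tibble())

# Address of one column before the class change
addr_before <- address(df$x)

# Change the class attribute by reference (no new object is assigned)
setattr(df, "class", tbl_class)

is_tibble(df)                          # is df now recognised as a tibble?
identical(addr_before, address(df$x))  # does the column still live at the same address?
```

If the second check returns TRUE, the column data was not copied by the class change.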

And if there is any doubt about the original class, save it first and restore it afterwards:

old_class <- class(iris_tib)
setDT(iris_tib)
## ... a bunch of data.table operations
setDF(iris_tib)
setattr(iris_tib, "class", old_class)

This seems to do the necessary job of converting back to a tibble.
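
For completeness, here is a sketch of the whole round trip on the iris example from the question, assuming data.table and tibble are installed. It only checks that the object ends up as a tibble again and that the column added with `:=` is still there. It is not a benchmark and makes no claim about other attributes (for instance, grouping information on a grouped tibble).

```r
library(data.table)
library(tibble)

iris_tib <- as_tibble(iris)
old_class <- class(iris_tib)   # remember the original class vector

setDT(iris_tib)                # tibble -> data.table, by reference
iris_tib[, Sepal.Length.Mean := mean(Sepal.Length), by = Species]
setDF(iris_tib)                # data.table -> data.frame, by reference
setattr(iris_tib, "class", old_class)  # restore the tibble class, by reference

is_tibble(iris_tib)                       # tibble again?
is.data.table(iris_tib)                   # no longer a data.table?
"Sepal.Length.Mean" %in% names(iris_tib)  # new column kept?
```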

Matifou answered Nov 01 '22 23:11