When creating a tibble,
tbl <- tibble(A=1:5, B=6:10)
the result of
class(tbl)
is
[1] "tbl_df" "tbl" "data.frame"
I'm used to seeing this as I use dplyr quite a bit. But when is an object just a "tbl" (and not a "tbl_df") or vice versa? I'd just like to know a bit more about the difference, if any.
Any documentation would be much appreciated!
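You can think of a "tibble" as an interface. If an object can respond to all the tibble actions, then you can treat it as a tibble. R doesn't have strong typing: an S3 class is just a character vector of labels attached to the object, and methods are looked up by walking that vector from left to right.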
So tbl is the generic tibble class, and tbl_df is a specific type of tibble that stores its data in a data.frame.
There are other packages like dtplyr that allow you to act like a tibble but store your data in a data.table. For example
library(dtplyr)
ds <- tbl_dt(mtcars)  # tbl_dt() comes from dtplyr releases before 1.0.0; later releases use lazy_dt() instead
class(ds)
# [1] "tbl_dt" "tbl" "data.table" "data.frame"
There's also the dbplyr package which allows you to use a SQL database back end. For example
library(dplyr)
# RSQLite's argument is dbname, not path
con <- DBI::dbConnect(RSQLite::SQLite(), dbname = ":memory:")
copy_to(con, mtcars, "mtcars", temporary = FALSE)
cars_db <- tbl(con, "mtcars")
class(cars_db)
# [1] "tbl_dbi" "tbl_sql" "tbl_lazy" "tbl"
So again we see that this object can generally act as a tibble, but it carries additional classes so that it can try to do all its work in the database engine rather than manipulating the data in R itself.
So there's not really a "difference" between tbl and tbl_df. The latter just says how the tibble is actually implemented, so that behavior can be specialized (for example, optimized) for that kind of storage.
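To make the dispatch idea concrete, here is a minimal base-R sketch. It does not use tibble or dplyr at all: `describe()` and its methods are hypothetical names invented for illustration, and the two objects are built by hand with class vectors copied from the output shown above.

```r
# Hypothetical generic, not part of tibble or dplyr; it only illustrates S3 dispatch.
describe <- function(x, ...) UseMethod("describe")

# Fallback method: applies to anything that carries the generic "tbl" class.
describe.tbl <- function(x, ...) "some kind of tbl"

# More specific method for data-frame-backed tibbles; NextMethod() then
# continues along the class vector to describe.tbl.
describe.tbl_df <- function(x, ...) {
  paste("data-frame-backed tbl;", NextMethod())
}

# Hand-built stand-ins that only mimic the class vectors from this thread.
local_tbl  <- structure(data.frame(A = 1:5, B = 6:10),
                        class = c("tbl_df", "tbl", "data.frame"))
remote_tbl <- structure(list(), class = c("tbl_lazy", "tbl"))

res_local  <- describe(local_tbl)   # picks describe.tbl_df first
res_remote <- describe(remote_tbl)  # no tbl_lazy method, so describe.tbl is used

is_tbl_local     <- inherits(local_tbl, "tbl")      # both objects are "tbl"
is_tbl_df_remote <- inherits(remote_tbl, "tbl_df")  # only local_tbl is "tbl_df"
```

Code that only needs the shared interface can test for `inherits(x, "tbl")`, while a backend-specific method takes over whenever a more specific class appears earlier in the class vector.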
For more information, you can check out the tibble vignette or the extending tibble vignette.