I am trying to use the data.table package inside my own package. A minimal working example follows.
I create a function, test.fun, that simply creates a small data.table object and then computes the count, sum, and mean of the "Val" column, grouped by the "A" column. The code is:
    test.fun <- function() {
      library(data.table)
      testdata <- data.table(A = rep(seq(1, 5), 5), Val = rnorm(25))
      setkey(testdata, A)
      res <- testdata[, {list(Ct = length(Val), Total = sum(Val), Avg = mean(Val))}, "A"]
      return(res)
    }
When I create this function in a regular R session, and then run the function, it works as expected.
    > res<-test.fun()
    data.table 1.8.0 For help type: help("data.table")
    > res
         A Ct      Total        Avg
    [1,] 1  5 -0.5326444 -0.1065289
    [2,] 2  5 -4.0832062 -0.8166412
    [3,] 3  5  0.9458251  0.1891650
    [4,] 4  5  2.0474791  0.4094958
    [5,] 5  5  2.3609443  0.4721889
When I put this function into a package, install the package, load the package, and then run the function, I get an error message.
    > library(testpackage)
    > res<-test.fun()
    data.table 1.8.0 For help type: help("data.table")
    Error in `[.data.frame`(x, i, j) : object 'Val' not found

Can anybody explain to me why this is happening and what I can do to fix it? Any help is very much appreciated.
data.table is an R package that extends data.frame. It is widely used for fast aggregation of large datasets, low-latency add/update/remove of columns, quicker ordered joins, and a fast file reader.
The data.table package provides the data.table object, an enhanced version of the default data.frame. It is fast and has a terse syntax.
Andrie's guess is right, +1. There is a FAQ on it (see vignette("datatable-faq")), as well as a new vignette on importing data.table:
FAQ 6.9: I have created a package that depends on data.table. How do I ensure my package is data.table-aware so that inheritance from data.frame works?
Either i) include data.table in the Depends: field of your DESCRIPTION file, or ii) include data.table in the Imports: field of your DESCRIPTION file AND import(data.table) in your NAMESPACE file.
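As a rough illustration of option ii), the two files might contain fragments like the following. The package name testpackage comes from the question; the export(test.fun) line is an assumption about how the function is made visible, and all other DESCRIPTION fields are omitted.

DESCRIPTION (fragment):

```
Package: testpackage
Imports:
    data.table
```

NAMESPACE (fragment):

```r
# Makes the package data.table-aware by importing the whole namespace
import(data.table)
# Assumed: the question's function is exported so users can call it
export(test.fun)
```

With these in place, the library(data.table) call inside test.fun should no longer be needed.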
Further background ... at the top of [.data.table (and other data.table functions), you'll see a switch depending on the result of a call to cedta(). This stands for Calling Environment Data Table Aware. Typing data.table:::cedta reveals how it's done. It relies on the calling package having a namespace, and that namespace Import'ing or Depend'ing on data.table. This is how data.table can be passed to non-data.table-aware packages (such as functions in base) and those packages can use absolutely standard [.data.frame syntax on the data.table, blissfully unaware that the data.frame is() a data.table, too.
This is also why data.table inheritance didn't used to be compatible with namespaceless packages, and why upon user request we had to ask authors of such packages to add a namespace to their package to be compatible. Happily, now that R adds a default namespace for packages missing one (from v2.14.0), that problem has gone away:
CHANGES IN R VERSION 2.14.0
* All packages must have a namespace, and one is created on installation if not supplied in the sources.
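To make the awareness check described in the FAQ more concrete, here is a small sketch, assuming data.table is installed. The helper imports_data_table() is hypothetical (it is not part of data.table) and only approximates one part of what cedta() looks at; the real function handles more cases. Code run from the global environment is treated as data.table-aware, which would explain why test.fun works in a regular session but not inside a package that lacks the import.

```r
library(data.table)

# Print the internal (unexported) function that decides whether the
# calling environment is data.table-aware.
print(data.table:::cedta)

# Hypothetical helper, not part of data.table: does the namespace of
# an installed package import data.table?
imports_data_table <- function(pkg) {
  "data.table" %in% names(getNamespaceImports(asNamespace(pkg)))
}

# stats ships with R and does not import data.table, so it falls back
# to standard [.data.frame behaviour.
imports_data_table("stats")
```

After applying the fix from the FAQ and reinstalling, imports_data_table("testpackage") would be expected to return TRUE.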
Here is the complete recipe:
Add data.table to Imports in your DESCRIPTION file.
Add @import data.table to the roxygen2 comments in the relevant .R file (i.e., the .R file that houses the function throwing Error in `[.data.frame`(x, i, j) : object 'Val' not found).
Type library(devtools) and set your working directory to the main directory of your R package.
Type document(). This will ensure that your NAMESPACE file includes an import(data.table) line.
Type build().
Type install().
For a nice primer on what build() and install() do, see: http://kbroman.org/pkg_primer/.
Then, the next time you start an R session, you can jump right in:
Type library("my_R_package").
Type the name of your function that's housed in the .R file mentioned above.
Enjoy! You should no longer receive the dreaded Error in `[.data.frame`(x, i, j) : object 'Val' not found.
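The recipe above could look roughly like this for the function from the question. This is a sketch that assumes the package uses roxygen2 and that devtools is installed; the title line and the @export tag are illustrative additions, and the library(data.table) call inside the function is dropped on the assumption that the import makes it unnecessary.

```r
#' Summarise Val by group A (illustrative title)
#'
#' @import data.table
#' @export
test.fun <- function() {
  testdata <- data.table(A = rep(seq(1, 5), 5), Val = rnorm(25))
  setkey(testdata, A)
  # Count, total, and mean of Val within each level of A
  testdata[, list(Ct = length(Val), Total = sum(Val), Avg = mean(Val)), by = "A"]
}

# From the package's main directory, in an interactive session:
# library(devtools)
# document()  # regenerates NAMESPACE, which should now contain import(data.table)
# build()
# install()
```

After document() runs, it is worth opening NAMESPACE to confirm that the import(data.table) line is actually there before building.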