
Modifying package gbm of R

We are experimenting with the gbm package on a fairly large dataset (~140 million rows) and have run into a problem with R's memory requirements.

We tried combining the 'gbm' and 'bigmemory' packages without success. Our next thought was to modify the C++ source code so that it draws data from a local database where we have stored our dataset.

So we were wondering whether there is a more appropriate or well-known practice for changing how memory is allocated inside gbm's C++ code. Has anyone tried something similar?

Trifyllenia asked Jul 27 '12 11:07



1 Answer

I’m not familiar with the gbm package, but if it works on data frames or vectors of some kind you could use the ff package.

Quote: The ff package provides data structures that are stored on disk but behave (almost) as if they were in RAM by transparently mapping only a section (pagesize) in main memory...
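
As a rough illustration of that idea, here is a minimal sketch, not a tested recipe for gbm. It assumes the ff and gbm packages are installed, uses a small simulated table in place of the real 140-million-row dataset, and the column names (x1, x2, y), chunk size, and sample size are made up for the example. As far as I know gbm expects an ordinary in-memory data frame, so the sketch keeps the full table on disk in an ffdf and pulls only a random subsample of rows into RAM for fitting.

```r
library(ff)
library(gbm)

set.seed(1)
n <- 1e5           # toy size; stands in for the real ~140 million rows
chunk_size <- 1e4  # hypothetical chunk size, tune to available RAM

# ff vectors are backed by files on disk, not held in RAM.
x1 <- ff(vmode = "double", length = n)
x2 <- ff(vmode = "double", length = n)
y  <- ff(vmode = "double", length = n)

# Fill the on-disk vectors one chunk at a time so that only
# chunk_size values are ever held in memory at once.
for (start in seq(1, n, by = chunk_size)) {
  idx <- start:min(start + chunk_size - 1, n)
  a <- rnorm(length(idx))
  b <- rnorm(length(idx))
  x1[idx] <- a
  x2[idx] <- b
  y[idx]  <- 2 * a - b + rnorm(length(idx), sd = 0.1)
}

# Combine the vectors into a disk-backed data frame.
big <- ffdf(x1 = x1, x2 = x2, y = y)

# Indexing an ffdf by rows returns an ordinary data.frame,
# so only the sampled rows are loaded into RAM.
rows  <- sort(sample.int(n, 5000))
train <- big[rows, ]

# Fit gbm on the in-memory subsample.
fit <- gbm(y ~ x1 + x2, data = train,
           distribution = "gaussian",
           n.trees = 100, interaction.depth = 2)
```

For a real dataset the ffdf would more likely be built from a file, for example with read.csv.ffdf, rather than simulated. Whether a subsample is good enough depends on the problem; this does not make gbm itself work out of core.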

Markus answered Sep 19 '22 17:09
