I have an Rcpp function that is called from inside an R function. The R function produces some object (say, a large list) and passes it to the Rcpp function. Inside the Rcpp function, I process the R object and load the results into a number of C++ classes. At that point the R object is no longer needed, and I want to remove it to free memory for the main algorithms.
The idea is:
// [[Rcpp::export]]
void cppFun(List structuredData)
{
// copy structuredData to C++ classes
// Now I want structuredData gone to save memory
// main algorithms ...
}
/***R
rFun <- function(input)
{
# R creates structuredData from input
cppFun(structuredData)
}
*/
I tried calling R's rm() from C++, but it only finds object names in R's global environment. For example:
// [[Rcpp::export]]
void cppFun()
{
Language("rm", "globalDat").eval();
Language("gc").eval();
}
/***R
globalDat = 1:10
ls() # shows "globalDat" is created.
cppFun() # removes "globalDat" from the global environment.
ls() # shows "globalDat" is no longer in the environment.
*/
However, the following does not work:
// [[Rcpp::export]]
void cppFun()
{
Language("rm", "locDat").eval();
Language("gc").eval();
}
/***R
rFun <- function (x)
{
locDat = x
print(ls()) # shows "x" and "locDat" are created
cppFun()
print(ls())
}
globalDat = 1:10
ls() # shows "globalDat" is created.
rFun(globalDat) # prints "locDat" and "x" twice, plus a warning message: In rm("locDat") : object 'locDat' not found
locDat = globalDat
rFun(globalDat) # this removes "locDat" from the global environment, not from the function's environment.
*/
Am I on the right track to the goal? Is there any better way?
Thank you!
Thought of a hacky solution:
Write a shell class wrapping references to all the necessary C++ structured data classes.
In the R function, (i) process the input; (ii) feed the structured R data to the Rcpp function; (iii) in the Rcpp function, new a shell class object, load the structured R data; (iv) memcpy the shell class pointer to a double (8 bytes, if 32-bit system, use int); (v) return the double; (vi) return the double out of the R function. Now the structured R object dies while the newed C++ shell object still lives. Call gc() for garbage collection.
Feed the double to the main C++/Rcpp function. memcpy this double to a shell class pointer. delete the shell class pointer before function returns.
Tests show the above works. Just found "external pointer" or Rcpp::XPtr designed for a similar purpose?
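For reference, the pointer round trip in steps (iii) to (v) and the final step can be sketched in plain C++ without any Rcpp types. The names Shell, makeHandle and consumeHandle are made up for illustration, and the sketch assumes a 64-bit system where a pointer is exactly as wide as a double. Note that nothing frees the shell if the second call never happens; as far as I understand, that is the gap Rcpp::XPtr closes by registering a finalizer with R's garbage collector.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

// Hypothetical shell class; stands in for the wrapper around the C++ data classes.
struct Shell {
  std::vector<double> data;
};

// The hack only works when a pointer fits exactly into a double (64-bit systems).
static_assert(sizeof(Shell*) == sizeof(double), "pointer must be 8 bytes");

// Steps (iii)-(v): allocate the shell, load the data, hide the pointer in a double.
double makeHandle(const std::vector<double>& input) {
  Shell* shell = new Shell{input};               // deep copy of the input values
  double handle;
  std::memcpy(&handle, &shell, sizeof(handle));  // copy the pointer's bytes, not its value
  return handle;
}

// Later call: recover the pointer, use the data, then free the shell.
double consumeHandle(double handle) {
  Shell* shell = nullptr;
  std::memcpy(&shell, &handle, sizeof(shell));
  assert(shell != nullptr);
  double total = 0.0;
  for (double v : shell->data) total += v;
  delete shell;  // the handle must not be used again after this
  return total;
}
```

The double here is only a container for bytes; doing arithmetic on it, or passing it to consumeHandle twice, would break the scheme.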
Doing something along these lines is considered an antipattern in Rcpp and is highly counterproductive. The problem is that Rcpp performs a shallow copy when moving an R object to C++, which means the R object shares its memory allocation with the instantiated C++ object. If you remove the R object while a C++ object still references it, you may run into trouble later in the process, most likely as a segmentation fault (segfault).
Now, if you intend to do a deep copy from an R object into a C++ structure, this wouldn't be quite as toxic. With a deep copy, the data no longer references the original R object. However, this is not the default behavior in Rcpp.
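The difference can be illustrated with a plain C++ analogy (no Rcpp types involved; View, shallowView, deepCopy and sumAfterRelease are hypothetical names). A heap-allocated vector stands in for the R object, and releasing it stands in for rm() plus gc(). In Rcpp itself, Rcpp::clone() or Rcpp::as<std::vector<double>>() would be the usual ways to get a deep copy.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Shallow "copy": only a pointer and a length; the storage still belongs to the original.
struct View {
  const double* data;
  std::size_t size;
};

View shallowView(const std::vector<double>& src) {
  return View{src.data(), src.size()};
}

// Deep copy: a separate allocation holding the same values.
std::vector<double> deepCopy(const std::vector<double>& src) {
  return std::vector<double>(src.begin(), src.end());
}

// Releases the original, then reads only from the deep copy.
double sumAfterRelease() {
  std::unique_ptr<std::vector<double>> original(
      new std::vector<double>{1.0, 2.0, 3.0});
  View view = shallowView(*original);
  std::vector<double> copy = deepCopy(*original);
  original.reset();  // stands in for rm() + gc() on the R side
  (void)view;        // view.data now dangles; dereferencing it would be undefined behavior
  double total = 0.0;
  for (double v : copy) total += v;
  return total;
}
```

Only the deep copy is safe to read once the original is gone, at the cost of briefly holding two copies of the data in memory.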
With this being said, I strongly discourage deleting objects mid-process. If you are truly memory strapped, try "chunking" (dividing) the data more, performing operations with a database, buying additional RAM, or waiting for ALTREP.
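As a rough sketch of what chunking can look like on the C++ side, the function below keeps at most one chunk in memory at a time. The ChunkLoader callback and sumInChunks are hypothetical names; where the chunks actually come from (a file, a database query, slices prepared in R) is left open, and a simple sum stands in for the real algorithm.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical loader: appends elements [start, start + count) to `buffer`.
using ChunkLoader =
    std::function<void(std::size_t start, std::size_t count, std::vector<double>& buffer)>;

// Sums `total` elements while holding at most `chunkSize` of them in memory.
double sumInChunks(std::size_t total, std::size_t chunkSize, const ChunkLoader& load) {
  double sum = 0.0;
  if (chunkSize == 0) return sum;  // avoid an endless loop on a zero chunk size
  std::vector<double> buffer;
  for (std::size_t start = 0; start < total; start += chunkSize) {
    std::size_t count = std::min(chunkSize, total - start);
    buffer.clear();              // reuse the same allocation for every chunk
    load(start, count, buffer);
    for (double v : buffer) sum += v;
  }
  return sum;
}
```

Peak memory is then governed by chunkSize rather than by the size of the full data set.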