
Object memory usage with R-reticulate python

Tags: r, reticulate

I'm wondering how efficiently reticulate handles memory for Python objects.

Suppose I have a 5 GB pandas DataFrame called data_pandas in the Python session that reticulate manages, and I'd like to analyze it in R.

When I access the object from R as py$data_pandas, does reticulate internally copy the DataFrame into an R data.frame (i.e. create another 5 GB data.frame in R)?

And what about the reverse direction (accessing an R data.frame from Python)?

asked Mar 26 '26 06:03 by Matthew Son


1 Answer

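I'm no expert, but judging from the reticulate vignette on arrays, a Python array is always copied when it is moved into R, so you should expect at least two copies in memory (the Python original plus the R copy), and sometimes three. In the other direction, R arrays are shared with Python unless a copy is needed. Note that the vignette talks about arrays rather than data frames specifically. The relevant passage: "R arrays are only copied to Python when they need to be, otherwise data are shared. Python arrays are always copied when moved into R arrays. This can sometimes lead to three copies of any one array in memory at any one time (at the moment this was written). Future versions will reduce that copy overhead to two." (From https://rstudio.github.io/reticulate/articles/arrays.html)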
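
To make the "shared" versus "copied" distinction concrete, here is a minimal sketch in plain NumPy. It does not call reticulate and says nothing about reticulate's internals; it only illustrates what it costs when data is duplicated instead of shared. The function name copy_report and the array size are made up for the illustration.

```python
import numpy as np


def copy_report(n_elements=1_000_000):
    """Compare a view that shares memory with an independent copy.

    Hypothetical helper for illustration only; it is not part of
    reticulate, pandas, or NumPy.
    """
    original = np.zeros(n_elements, dtype=np.float64)  # 8 bytes per element
    view = original[:]            # shares the underlying buffer
    duplicate = original.copy()   # allocates a second buffer of equal size

    return {
        "bytes_per_copy": original.nbytes,
        "view_shares_memory": bool(np.shares_memory(original, view)),
        "copy_shares_memory": bool(np.shares_memory(original, duplicate)),
        # Memory held while both the original and the copy are alive.
        "bytes_held_with_copy": original.nbytes + duplicate.nbytes,
    }


report = copy_report()
print(report)
```

If a conversion behaves like the copy case, then while both objects are alive the data occupies about twice its original size. For the 5 GB DataFrame in the question that would mean roughly 10 GB in total, assuming the conversion really does copy as the vignette describes for arrays.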

answered Mar 27 '26 18:03 by Ethan Bass


