
Is it possible to push/pull variables between two instances of R?

Tags: r

Suppose I have two instances of R running. Are there existing solutions to easily send variables/data from one instance to the other? Maybe even synchronize the values of a variable between the two instances?

For example, first the two instances (R1 & R2) would be connected somehow, then in R1:

> a <- 12
> push(a)

and at this point in R2:

> a
[1] 12

The keyword here is ease of use: make it as quick as possible (for the user) to interactively synchronize the value of certain variables. I would use this with Mathematica's RLink to work interactively in one R instance and push/pull data to/from Mathematica's instance.


I realize that the question might sound strange. The reason why I'm hopeful that something like this exists is that it would be useful for parallel or distributed computing as well (which is not my use case here).

Szabolcs asked Aug 04 '14 23:08




3 Answers

Have a look at svSocket. From the package description (svSocket.pdf):

The SciViews svSocket package provides a stateful, multi-client and preemptive socket server. [...]

Although initially designed to serve GUI clients, the R socket server can also be used to exchange data between separate R processes.

This demo video is really worth watching.
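As a rough sketch of how this could look for the example in the question: the instance that should receive the variable starts the socket server, and the other instance connects to it as a client. `startSocketServer()` and `evalServer()` are svSocket functions; the helper name `send_to_server`, the fixed variable name `a`, the host `localhost` and port 8888 are assumptions made for illustration.

```r
# In the receiving R instance (the server), run:
#   library(svSocket)
#   startSocketServer(port = 8888)

# In the sending R instance (the client):
# send_to_server() is a hypothetical helper, not part of svSocket.
send_to_server <- function(value, port = 8888) {
  con <- socketConnection(host = "localhost", port = port, blocking = FALSE)
  on.exit(close(con))
  a <- value
  # With a third argument, evalServer() evaluates it here (on the client)
  # and assigns the result to the variable named by the second argument
  # on the server.
  svSocket::evalServer(con, a, a)
  # With two arguments, it evaluates the expression on the server and
  # returns the result, so this reads the value back as a check.
  svSocket::evalServer(con, a)
}

# Usage on the client, once the server is running:
#   send_to_server(12)
```

After the call, typing `a` in the server instance should show the value that was sent.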

flodel answered Oct 13 '22 01:10


This is a different approach to the push/pull model, but you can use the bigmemory package to create a matrix that exists in shared memory (or on disk) that can be accessed across multiple R sessions on the same machine:

R session 1

library(bigmemory)
m <- matrix(1:9, 3, 3)
m <- as.big.matrix(m, type="double", backingfile="m.bin", descriptorfile="m.desc")
m
# An object of class "big.matrix"
# Slot "address":
# <pointer: 0x7fba95004ee0>

R session 2

library(bigmemory)
m <- attach.big.matrix("m.desc") 
# Now any changes you make to m will be reflected in both sessions!

This is also useful for parallel computing on matrices, since you only pass a pointer to the matrix to each of the spawned R sessions, rather than the whole object.

Since we've created a file-backed big matrix, it also allows you to create and operate on matrices larger than memory!
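To check that writes really are shared, a minimal sketch like the following can be run in a single session, with two handles standing in for the two R sessions. The temporary directory and the file names `shared.bin` and `shared.desc` are made up for the example; `filebacked.big.matrix()` and `attach.big.matrix()` are bigmemory functions.

```r
library(bigmemory)

# Hypothetical scratch directory for the backing and descriptor files
dir <- tempfile("bm_demo_")
dir.create(dir)

# First handle: create a 3 x 3 file-backed matrix filled with zeros
m1 <- filebacked.big.matrix(nrow = 3, ncol = 3, type = "double", init = 0,
                            backingfile = "shared.bin",
                            backingpath = dir,
                            descriptorfile = "shared.desc")

# Second handle: attach to the same matrix through its descriptor file,
# as a second R session would
m2 <- attach.big.matrix(file.path(dir, "shared.desc"))

# Write through the first handle, read through the second
m1[1, 1] <- 12
m2[1, 1]
```

Both handles map the same backing file, so the value written through `m1` is what `m2` reads. Note that this only covers numeric matrices, not arbitrary R objects.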

Parallel example

library(bigmemory)
library(doMC) # Windows users will need to choose a different parallel backend
library(foreach)
registerDoMC(4) # number of cores (new R sessions to spawn) to run in parallel.

m <- matrix(rnorm(1000*1000), 1000)
as.big.matrix(m, type="double", backingfile="m.bin", descriptorfile="m.desc")
# Just to make sure we don't have any of these objects in memory when we spawn the 
# parallel sessions
rm(m)
gc()

foreach(i = 1:4) %dopar% {
  m <- attach.big.matrix("m.desc")
  # do something!
}
Scott Ritchie answered Oct 13 '22 01:10


I think Redis can help you achieve what you want. You can use the R packages rredis and/or RcppRedis. This assumes a Redis server is running and reachable from both R instances.

On the first instance of R you can do:

library(rredis)
redisConnect()
redisSet("a", 12)
# [1] "OK"

Then on the second R instance you can do:

library(rredis)
redisConnect()
redisGet("a")
# [1] 12
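To get closer to the `push(a)` syntax from the question, the two calls could be wrapped in small helpers. This is only a sketch: `push` and `pull` are hypothetical names, not part of rredis, and they assume both sessions have already called `redisConnect()` against the same server.

```r
# Hypothetical helper: store a variable in Redis under its own name.
push <- function(x) {
  key <- deparse(substitute(x))   # e.g. push(a) uses the key "a"
  rredis::redisSet(key, x)
  invisible(key)
}

# Hypothetical helper: fetch the value stored under the given name and
# assign it to a variable of that name in the calling environment.
pull <- function(name, envir = parent.frame()) {
  key <- deparse(substitute(name))
  value <- rredis::redisGet(key)
  assign(key, value, envir = envir)
  invisible(value)
}

# Session R1, after rredis::redisConnect():
#   a <- 12
#   push(a)
# Session R2, after rredis::redisConnect():
#   pull(a)
#   a
```

This is still an explicit push and pull rather than automatic synchronization: R2 only sees a new value after calling `pull(a)` again.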
dickoa answered Oct 13 '22 01:10