
How do I get node or process id when using doParallel?

I am running a parallel job on a 12-node cluster and was wondering whether there is a way to get the node ID, node number, or node name inside a foreach call.

Something like this:

foreach(i = 1:12, .combine=c) %dopar% {node.name()}

This will be helpful in processing the files.

Shambho asked Jul 09 '14 01:07



2 Answers

The foreach package doesn't provide a node ID or node name itself, but base R has the Sys.info() function, so you could use:

foreach(i = 1:12, .combine=c) %dopar% {
  Sys.info()[['nodename']]
}

To create a unique worker id, you can combine the node name with the process id of the worker:

foreach(i = 1:12, .combine=c) %dopar% {
  paste(Sys.info()[['nodename']], Sys.getpid(), sep='-')
}
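
For completeness, here is a minimal end-to-end sketch of the same idea, including the cluster setup that the snippets above leave out. The two local workers and the variable names `cl` and `worker_ids` are assumptions for illustration; on a real cluster you would pass your own host list to makeCluster().

```r
library(doParallel)  # also attaches foreach and parallel

# Assumption: two local workers, just to make the example runnable anywhere.
cl <- makeCluster(2)
registerDoParallel(cl)

# Each iteration reports "<nodename>-<pid>" for the worker that ran it.
worker_ids <- foreach(i = 1:6, .combine = c) %dopar% {
  paste(Sys.info()[["nodename"]], Sys.getpid(), sep = "-")
}

stopCluster(cl)

# Count how many iterations each worker handled.
print(table(worker_ids))
```

Note that the result has one entry per iteration, not one per worker, so the same id shows up several times when there are more iterations than workers.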
Steve Weston answered Nov 04 '22 13:11



After a lot of trial and error, I found the following to work:

foreach(i = 1:12, .combine=c) %dopar% {
  Sys.getpid()
}

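This returns the process ID of the worker that ran each iteration, which can be used as a worker id. Note that a PID is only unique within one machine, so if the workers are spread over several hosts, two of them could share a PID; in that case, combine the PID with the node name.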
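
If small, stable indices (1, 2, ...) are more convenient than raw PIDs, for example to name per-worker output files, one possible approach is to give each worker an index up front. This is a sketch of a different technique, not something foreach provides: it uses clusterApply() from the parallel package, and the names `cl`, `worker_id`, and `ids` are hypothetical.

```r
library(doParallel)  # also attaches foreach and parallel

# Assumption: two local workers for illustration.
cl <- makeCluster(2)
registerDoParallel(cl)

# Store an index in each worker's global environment:
# worker 1 gets 1, worker 2 gets 2, and so on.
invisible(clusterApply(cl, seq_along(cl), function(id) {
  assign("worker_id", id, envir = .GlobalEnv)
}))

# Inside the loop, read the index back from the worker's global environment.
ids <- foreach(i = 1:6, .combine = c) %dopar% {
  get("worker_id", envir = .GlobalEnv)
}

stopCluster(cl)

print(ids)
```

Which iterations land on which worker depends on scheduling, so the order of the values in `ids` is not fixed.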

Shambho answered Nov 04 '22 14:11
