
Monitoring Progress/Debugging Parallel R Scripts

Tags: foreach, r

Among the choices I have for quickly parallelizing simple code (snowfall, foreach, and so on), what are my options for showing the progress of all the slave processes? Do any of the offerings excel in this regard?

I've seen that snowfall 1.70 has sfCat(), but it doesn't seem to cat output to the master R session.

Christopher DuBois asked Jan 27 '10 06:01



1 Answer

That's where it can turn into a black art... I notice that you did not list MPI or PVM -- those old workhorses of parallel computing do have monitors. You may find solutions by going outside of R and relying on job schedulers (Slurm, TORQUE, ...).

If you can't do that (and hey, there are reasons why we like the simplicity of snow, foreach, ...) then maybe you can alter your jobs to log a 'heartbeat' or progress message every N steps. You can log to text files (if you have an NFS or SMB/CIFS share), log to a database, or heck, send a tweet with R. It will most likely be specific to your app, and yes, it will have some cost.
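A minimal sketch of the text-file heartbeat idea, using only the parallel package that ships with R. Everything named here is an assumption made for illustration: run_task(), the every argument, and the log directory layout are hypothetical, and the sqrt() loop is a stand-in for real work. Each task appends to its own file so workers never write to the same file at once.

```r
library(parallel)

# Hypothetical worker: runs n_steps units of work and appends one heartbeat
# line to its own log file every `every` steps. It is self-contained so it
# can be shipped to a worker process without exporting anything else.
run_task <- function(task_id, n_steps, every, log_dir) {
  log_file <- file.path(log_dir, sprintf("task-%02d.log", task_id))
  total <- 0
  for (step in seq_len(n_steps)) {
    total <- total + sqrt(step)  # stand-in for the real computation
    if (step %% every == 0) {
      msg <- sprintf("%s pid=%d task=%d step=%d/%d",
                     format(Sys.time(), "%Y-%m-%d %H:%M:%S"),
                     Sys.getpid(), task_id, step, n_steps)
      cat(msg, "\n", sep = "", file = log_file, append = TRUE)
    }
  }
  total
}

# Assumed log location: a fresh directory under tempdir(). On a multi-node
# cluster this would have to be a path every node can see (NFS, SMB/CIFS).
log_dir <- file.path(tempdir(), "heartbeats")
unlink(log_dir, recursive = TRUE)
dir.create(log_dir, showWarnings = FALSE)

# Two local worker processes, four tasks, one heartbeat every 25 steps.
cl <- makeCluster(2)
results <- parLapply(cl, 1:4, run_task,
                     n_steps = 100, every = 25, log_dir = log_dir)
stopCluster(cl)

# The master (or any other session) can read the logs at any time.
log_files <- list.files(log_dir, full.names = TRUE)
heartbeats <- unlist(lapply(log_files, readLines))
print(heartbeats)
```

While the jobs are running you can watch the files from a second R session with readLines(), or from a shell with tail -f. The cost mentioned above shows up as one file append per heartbeat, so pick every large enough that logging stays a small fraction of the work.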

Dirk Eddelbuettel answered Oct 04 '22 16:10
