 

How to Profile R Code that Includes SNOW Cluster

I have a nested loop that I'm running with foreach, doSNOW, and a SNOW socket cluster. How should I go about profiling the code to make sure I'm not doing something grossly inefficient?

Also, is there any way to measure the data flowing between the master and the nodes in a SNOW cluster?

Thanks,

James

James in Ottawa asked Nov 06 '22 14:11


1 Answer

That is an excellent question. Off the top of my head, start with a comparison between

  • a serial solution (no snow),
  • a serial solution with snow, i.e. a cluster with a single node (to get an idea of the overhead), and
  • a parallel solution, maybe varying the number of nodes N to see what kind of speedup you get.
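The three-way comparison above could be sketched roughly as follows. This assumes the foreach and doSNOW packages are installed; `work`, `n_tasks` and `time_with_nodes` are hypothetical names made up for illustration, and `work` is only a stand-in for the real loop body.

```r
library(foreach)
library(doSNOW)  # depends on snow, which provides makeCluster/stopCluster

# Hypothetical stand-in for the real loop body.
work <- function(i) {
  s <- 0
  for (k in 1:20000) s <- s + sqrt(i + k)
  s
}
n_tasks <- 40

# 1. Serial baseline, no snow involved.
t_serial <- system.time(
  res_serial <- sapply(seq_len(n_tasks), work)
)[["elapsed"]]

# 2./3. Same work through foreach on a socket cluster with n_nodes workers.
# f is passed as an argument so that foreach finds it in the local frame
# and exports it to the workers.
time_with_nodes <- function(n_nodes, f) {
  cl <- makeCluster(n_nodes, type = "SOCK")
  on.exit(stopCluster(cl))
  registerDoSNOW(cl)
  system.time(
    res <- foreach(i = seq_len(n_tasks), .combine = c) %dopar% f(i)
  )[["elapsed"]]
}

timings <- c(
  serial = t_serial,
  snow_1 = time_with_nodes(1, work),  # one node: mostly shows the overhead
  snow_2 = time_with_nodes(2, work)   # raise N to see how it scales
)
print(timings)
```

The gap between `serial` and `snow_1` is a rough estimate of the cluster overhead, although it also includes the time to ship `f` to the worker. The timings are wall-clock and noisy, so it is worth repeating each run a few times.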

The never-released-on-CRAN version 0.3.4 of snow also has additional plotting commands that are useful for analysis. You can get it from this directory at Luke Tierney's site.
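As a hedged illustration of this kind of timing analysis: later CRAN releases of snow document a `snow.time()` function with `print` and `plot` methods, so the sketch below assumes your installed snow provides it. It times a plain `clusterCall`; whether the same timing data is recorded for traffic generated through foreach/doSNOW is not something this sketch verifies. The payload `x` is made up for illustration.

```r
library(snow)

cl <- makeCluster(2, type = "SOCK")
x <- rnorm(1e5)  # illustrative payload shipped to every worker

# Rough size of what travels over the socket for this argument, in bytes.
payload_bytes <- length(serialize(x, NULL))
print(payload_bytes)

# Collect timing data for one cluster call (assumes snow.time is available).
tm <- snow.time(
  clusterCall(cl, function(x) { for (i in 1:50) sum(x); NULL }, x)
)
print(tm)                      # elapsed, communication and per-node compute time
if (interactive()) plot(tm)    # Gantt-style chart of cluster usage

stopCluster(cl)
```

The `serialize()` trick only estimates the size of the objects you pass explicitly; it does not count whatever else gets exported to the workers.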

Real profiling, of course, is hard given the distributed nature of the computation.

Dirk Eddelbuettel answered Nov 11 '22 05:11