
Does Rmpi require an active Internet connection?

I just installed Rmpi using this tutorial: http://www.stats.uwo.ca/faculty/yu/Rmpi/mac_os_x.htm on Mac OS X Mountain Lion. I need Rmpi only to make use of all cores on my machine, not for deployment on a hardware cluster or similar.

Everything works fine, but I have noticed that whenever I don't have an active Internet connection (for example, when sitting on a train or after turning wireless off), spawning slaves fails. Is it supposed to work like this?

> require( Rmpi )
> mpi.spawn.Rslaves( nslaves=2 )

--------------------------------------------------------------------------
At least one pair of MPI processes are unable to reach each other for
MPI communications.  This means that no Open MPI device has indicated
that it can be used to communicate between these processes.  This is
an error; Open MPI requires that all MPI processes be able to reach
each other.  This error can sometimes be the result of forgetting to
specify the "self" BTL.

  Process 1 ([[56132,1],0]) is on host: ABC-MB02
  Process 2 ([[56132,2],0]) is on host: ABC-MB02
  BTLs attempted: self sm

Your MPI job is now going to abort; sorry.
--------------------------------------------------------------------------
    2 slaves are spawned successfully. 0 failed.
[ABC-MB02:53970] 2 more processes have sent help message help-mca-bml-r2.txt / unreachable proc
[ABC-MB02:53970] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages

Then the CPU load jumps to 100% and the R session eventually crashes.

Any ideas how I can avoid this behavior? This is my sessionInfo():

R version 2.15.2 (2012-10-26)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] graphics  grDevices datasets  stats     utils     methods   base     

other attached packages:
[1] Rmpi_0.6-3     ggplot2_0.9.3  stringr_0.6.2  reshape2_1.2.2 plyr_1.8      

loaded via a namespace (and not attached):
 [1] colorspace_1.2-1   dichromat_2.0-0    digest_0.6.3       grid_2.15.2        gtable_0.1.2       labeling_0.1      
 [7] MASS_7.3-23        munsell_0.4        proto_0.3-10       RColorBrewer_1.0-5 scales_0.2.3       tools_2.15.2

Beasterfield asked Oct 04 '22 08:10



1 Answer

It doesn't need an Internet connection, but Open MPI seems to fail when spawning processes if only the "self" and "sm" BTLs are available. On my Mac laptop, the "tcp" BTL isn't available unless at least one network is shown as "Connected" in "Network Preferences". Interestingly, I was able to use Rmpi successfully when the workers were started by mpirun rather than being spawned. I also had no problems spawning processes on a Linux machine that was completely unplugged from the network.

I was able to get my laptop to spawn processes successfully by connecting it to another computer with an Ethernet cable, even to a little Raspberry Pi. Although the cable couldn't actually be used for anything, it tricked the Mac into treating "Ethernet" as connected. That made the "tcp" BTL available in Open MPI, so spawning processes worked.

Update:

I finally figured out a solution to this problem that works on my MacBook Pro. By setting the btl_tcp_if_include MCA parameter to "lo0", you can force Open MPI to use the loopback interface for TCP communication when your laptop isn't connected to any external network. One way to set it is with an environment variable in your R script:

Sys.setenv(OMPI_MCA_btl_tcp_if_include='lo0')
library(Rmpi)
mpi.spawn.Rslaves(nslaves=2)

It appears to work as long as you set the environment variable before loading Rmpi, which is when MPI_INIT is called.
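
As a rough sketch of the same idea, the setting can be wrapped so that it does not overwrite an interface you have already chosen, and so that the value can be checked before MPI is initialized. The helper name use_loopback_btl is made up for this illustration and is not part of Rmpi or Open MPI. The default "lo0" is the loopback interface name on OS X; on Linux it is usually "lo".

```r
# Hypothetical helper (not part of Rmpi): point Open MPI's TCP BTL at the
# loopback interface unless an interface has already been chosen.
use_loopback_btl <- function(iface = "lo0") {
  if (!nzchar(Sys.getenv("OMPI_MCA_btl_tcp_if_include"))) {
    Sys.setenv(OMPI_MCA_btl_tcp_if_include = iface)
  }
  # Return the value that Open MPI will see when MPI is initialized.
  Sys.getenv("OMPI_MCA_btl_tcp_if_include")
}

# This must run before Rmpi is loaded, because loading Rmpi initializes MPI.
use_loopback_btl()

# Only attempt the spawn when Rmpi is actually installed.
if (requireNamespace("Rmpi", quietly = TRUE)) {
  library(Rmpi)
  mpi.spawn.Rslaves(nslaves = 2)
  mpi.close.Rslaves()
}
```

If the variable is set after Rmpi has been loaded in the same session, it will most likely have no effect, so restart R before trying again.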

Steve Weston answered Oct 07 '22 00:10
