I often (but not always) get the following error when running MPI jobs after switching Wi-Fi networks:
Fatal error in MPI_Init: Other MPI error, error stack:
MPIR_Init_thread(498)..............:
MPID_Init(187).....................: channel initialization failed
MPIDI_CH3_Init(89).................:
MPID_nem_init(320).................:
MPID_nem_tcp_init(171).............:
MPID_nem_tcp_get_business_card(418):
MPID_nem_tcp_init(377).............: gethostbyname failed, MacBook-Pro.local (errno 1)
Everything works fine in the coffee shop, and then when I come home, I get the above error. Nothing else has changed.
I've checked the /etc/hosts and /private/etc/hosts files, and they look okay:
##
# Host Database
#
# localhost is used to configure the loopback interface
# when the system is booting. Do not change this entry.
##
127.0.0.1 localhost
255.255.255.255 broadcasthost
I can ping localhost, so the problem isn't exactly that localhost isn't resolved.
Rebooting always fixes the problem, but is there something simple I can do to "reset" my system so that it resolves the local hostname again?
I don't have access to the details of the MPI initialization routines in the code I am running and am not making any explicit calls to gethostname.
I am using MPICH 3.1.4 (built February 2015) and am running OS X 10.10.3.
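The failing lookup can probably be reproduced outside MPI. The following is a minimal sketch, assuming Python 3 is available; the function name `resolve_own_hostname` and its two parameters are made up for illustration and are not part of MPICH. It asks the system for its own hostname and then tries to resolve that name, which is roughly the step the error stack reports as failing.

```python
import socket


def resolve_own_hostname(get_name=socket.gethostname, lookup=socket.gethostbyname):
    """Return (hostname, address); address is None if the lookup fails.

    get_name and lookup are hypothetical hooks so the check can be
    exercised without touching the real resolver.
    """
    name = get_name()
    try:
        return name, lookup(name)
    except OSError as exc:
        # socket.gaierror and socket.herror are both subclasses of OSError.
        print(f"lookup of {name!r} failed: {exc}")
        return name, None


if __name__ == "__main__":
    name, address = resolve_own_hostname()
    print(name, "->", address)
```

If the address comes back as None while pinging localhost still works, the problem is with the machine's own name (here MacBook-Pro.local) rather than with localhost.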
The answer is very simple. Here is what seems to work.
I edited the file /etc/hosts (or /private/etc/hosts in OS X) and added the line
127.0.0.1 macbook-pro.local
so now my hosts file looks like:
##
# Host Database
#
# localhost is used to configure the loopback interface
# when the system is booting. Do not change this entry.
##
127.0.0.1 localhost
255.255.255.255 broadcasthost
127.0.0.1 macbook-pro.local
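To confirm that the hosts file really contains a mapping for the machine's name, a small check along these lines may help. It is only a sketch: `hosts_maps` is a hypothetical helper, and it compares names case-insensitively on the assumption that the system resolver does the same, which is why the lowercase entry above should match MacBook-Pro.local.

```python
def hosts_maps(hosts_text, hostname):
    """Return the address mapped to hostname in hosts-file text, or None."""
    wanted = hostname.lower()
    for line in hosts_text.splitlines():
        # Drop comments, then split into address followed by one or more names.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and wanted in (name.lower() for name in fields[1:]):
            return fields[0]
    return None


if __name__ == "__main__":
    sample = "127.0.0.1 localhost\n127.0.0.1 macbook-pro.local\n"
    print(hosts_maps(sample, "MacBook-Pro.local"))
```

To run it against the real file, pass the contents of /etc/hosts as `hosts_text`, for example `hosts_maps(open("/etc/hosts").read(), "MacBook-Pro.local")`.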