The Spark Master listens on several ports. Unfortunately the IP address / hostname scheme used differs among them, and connections often fail. We are then left to wonder how to fix the connection problems, since Spark decides on its own how to translate among hostnames, .local names, loopback (127.0.0.1), and interface IP addresses.
The important consideration: some of the networking clients/connections need an exact string match to successfully contact the master. In that case 127.0.0.1 is not the same as hostname. I have seen cases where hostname works and hostname.local does not (that one is a Mac-centric problem). But then the former stops working as well, and I lack the tools to troubleshoot why.
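To make the exact-match point concrete, here is a minimal Scala sketch (the host mellyrn and port 7077 are from my setup below; substitute your own). The only claim it exercises is that the string handed to setMaster must equal the spark:// URL the master announced, not merely resolve to the same machine:

import org.apache.spark.{SparkConf, SparkContext}

object ExactMatchCheck {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("exact-match-check")
      // Must match the master's announced spark:// URL character for character;
      // "spark://127.0.0.1:7077" can be refused even when the master is local.
      .setMaster("spark://mellyrn:7077")
    val sc = new SparkContext(conf)
    println(sc.parallelize(1 to 10).count())  // trivial job to prove the connection
    sc.stop()
  }
}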
The --master option provides further opportunities for confusion on Linux when the machine has both an internal and an external IP address.
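One mitigation I would try on such a dual-address box, sketched below with a hypothetical internal address 10.0.0.5, is to set the address the driver advertises explicitly via the spark.driver.host property rather than letting Spark guess:

import org.apache.spark.{SparkConf, SparkContext}

// Sketch: pin the address the driver advertises to the cluster.
// 10.0.0.5 is a hypothetical internal IP that the workers can reach.
val conf = new SparkConf()
  .setAppName("pinned-driver-host")
  .setMaster("spark://mellyrn:7077")
  .set("spark.driver.host", "10.0.0.5")
val sc = new SparkContext(conf)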
Below is an example from my Mac. I see other patterns on AWS and yet others on standalone clusters. It is all perplexing and time consuming, since none of it is clearly documented either.
Below we see the output when the --master option was provided to spark-submit:
--master spark://mellyrn:7077
Notice the variety of IP addresses and hostnames:
http://25.x.x.x:4040
akka.tcp://sparkMaster@mellyrn:7077
mellyrn/127.0.0.1:7077
Here is the output on the Mac:
15/07/31 12:21:34 INFO SparkEnv: Registering OutputCommitCoordinator
15/07/31 12:21:34 INFO Utils: Successfully started service 'SparkUI' on port 4040.
15/07/31 12:21:34 INFO SparkUI: Started SparkUI at http://25.101.19.24:4040
15/07/31 12:21:34 INFO AppClient$ClientActor: Connecting to master akka.tcp://sparkMaster@mellyrn:7077/user/Master...
15/07/31 12:21:35 WARN AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@mellyrn:7077: akka.remote.InvalidAssociation: Invalid address: akka.tcp://sparkMaster@mellyrn:7077
15/07/31 12:21:35 WARN Remoting: Tried to associate with unreachable remote address [akka.tcp://sparkMaster@mellyrn:7077]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: Connection refused: mellyrn/127.0.0.1:7077
15/07/31 12:21:54 INFO AppClient$ClientActor: Connecting to master akka.tcp://sparkMaster@mellyrn:7077/user/Master...
15/07/31 12:21:54 WARN AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@mellyrn:7077: akka.remote.InvalidAssociation: Invalid address: akka.tcp://sparkMaster@mellyrn:7077
15/07/31 12:21:54 WARN Remoting: Tried to associate with unreachable remote address [akka.tcp://sparkMaster@mellyrn:7077]. Address is now gated for 5000
On Linux the Spark connection with the --master option does work (though .setMaster() does not work reliably). Yet even on Linux a variety of master/driver strings is generated.
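Part of the .setMaster() unreliability may simply be precedence: properties set programmatically on the SparkConf take priority over the spark-submit command line, so a hard-coded .setMaster() silently wins over --master. A sketch of the submit-friendly arrangement, leaving the master out of the code entirely:

import org.apache.spark.{SparkConf, SparkContext}

object SubmitFriendly {
  def main(args: Array[String]): Unit = {
    // No setMaster() here: let spark-submit --master supply it, so the same
    // jar can run against local, standalone, or YARN masters unchanged.
    val conf = new SparkConf().setAppName("submit-friendly")
    val sc = new SparkContext(conf)
    println(sc.parallelize(1 to 100).count())
    sc.stop()
  }
}

This would then be submitted with something like spark-submit --master spark://mellyrn:7077 --class SubmitFriendly app.jar (the jar name is hypothetical).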
The problem was discovered: Spark is binding to a different local interface. I had a VPN client on the 25.X.X.X address, but the hostname resolves to 10.X.X.X. This is likely a bug in Spark; I will look into whether a JIRA has already been submitted for it.
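For diagnosing this kind of mismatch, a small interface-enumeration sketch in plain JVM code (no Spark required) shows whether the hostname resolves to a different address than the interfaces actually hold:

import java.net.{InetAddress, NetworkInterface}
import scala.collection.JavaConverters._

object BindCheck {
  def main(args: Array[String]): Unit = {
    // What the hostname resolves to (the address Spark may advertise)...
    val local = InetAddress.getLocalHost
    println(s"hostname ${local.getHostName} -> ${local.getHostAddress}")
    // ...versus every address configured on the local interfaces
    // (a VPN interface holding a 25.x.x.x address would show up here).
    for {
      nic  <- NetworkInterface.getNetworkInterfaces.asScala
      addr <- nic.getInetAddresses.asScala
    } println(s"${nic.getName}: ${addr.getHostAddress}")
  }
}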