I am currently building a custom Docker container from a plain distribution, with Apache Zeppelin and Spark 2.x inside.
My Spark jobs will run in a remote cluster, and I am using yarn-client as the master.
When I run a notebook and try to print sc.version, the program gets stuck. If I go to the remote resource manager, an application has been created and accepted, but in the logs I can read:
    INFO yarn.ApplicationMaster: Waiting for Spark driver to be reachable
My understanding of the situation is that the cluster is unable to talk back to the driver inside the container, but I don't know how to solve this issue.
I am currently using the following configuration:

- spark.driver.port set to PORT1, with option -p PORT1:PORT1 passed to the container
- spark.driver.host set to 172.17.0.2 (the IP of the container)
- SPARK_LOCAL_IP set to 172.17.0.2 (the IP of the container)
- spark.ui.port set to PORT2, with option -p PORT2:PORT2 passed to the container

I have the feeling I should change SPARK_LOCAL_IP to the host IP, but if I do so, the Spark UI is unable to start, which blocks the process one step earlier.
Thanks in advance for any ideas / advice!
Good question! First of all, as you know, Apache Zeppelin runs its interpreters in separate processes.
In your case, the Spark interpreter JVM process hosts the SparkContext and acts as the SparkDriver for the yarn-client deployment mode. According to the Apache Spark documentation, this process inside the container needs to be able to communicate back and forth with the YARN ApplicationMaster and all the Spark worker machines of the cluster.
This implies that you need a number of ports open and manually forwarded between the container and the host machine (see the sketch below). Here is an example of a project at ZEPL doing a similar job, where it took us 7 ports to get the job done.
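As a sketch of what that wiring can look like in practice; the port numbers are placeholders, and spark.driver.bindAddress assumes Spark 2.1+ (it does not exist in earlier 2.x releases). This is one common way to set it up, not necessarily the exact ZEPL configuration:

    # Pin the ports Spark would otherwise pick at random, then publish each one.
    docker run -d \
        -p 8080:8080 \
        -p 4040:4040 \
        -p 40001:40001 \
        -p 40002:40002 \
        my-zeppelin-spark

    # Matching Spark interpreter properties:
    # spark.ui.port            = 4040
    # spark.driver.port        = 40001
    # spark.blockManager.port  = 40002
    # spark.driver.host        = <IP of the Docker host>   (what YARN connects back to)
    # spark.driver.bindAddress = 172.17.0.2                (what the driver binds inside the container)

With spark.driver.host pointing at the host machine and the ports forwarded 1:1, the ApplicationMaster can reach the driver through the host, while spark.driver.bindAddress lets the driver still bind to the container's own IP.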
Another approach is to run Docker networking in host mode (though it apparently does not work on OS X, due to a recent bug).
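For example (assuming a Linux host; with host networking the -p mappings become unnecessary because the container shares the host's network stack):

    # The container shares the host's network stack, so the driver
    # binds directly on a host interface and no port publishing is needed.
    docker run -d --network host my-zeppelin-spark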