When Apache Spark runs in standalone cluster mode, it uses a number of ports for different kinds of network communication, among others between the driver and the executors/workers.
Spark release 1.1.0 added quite a number of properties for configuring the ports used, along with a guide: http://spark.apache.org/docs/latest/security.html#configuring-ports-for-network-security But it seems one can only control the server ports, i.e. the ports being listened on.
However, I couldn't find a way to control the client ports a Spark executor/worker opens to connect to the driver program. My driver program runs in Tomcat, and I have to be very specific in my catalina.policy to allow only specific IP addresses/ports.
So, is there a way to control all the ports used by Spark, so that I can configure socket permissions in the catalina.policy of the Tomcat instance running the driver program and it is able to communicate with executors/workers?
EDIT: The error I am getting on the Tomcat side is:
2014-09-19 16:55:42,437 [New I/O server boss #6] WARN T:[] V:[]o.j.n.c.s.nio.AbstractNioSelector - Failed to accept a connection.
java.security.AccessControlException: access denied ("java.net.SocketPermission" "<worker IP address>:44904" "accept,resolve")
at java.security.AccessControlContext.checkPermission(AccessControlContext.java:372) ~[na:1.7.0_67]
at java.security.AccessController.checkPermission(AccessController.java:559) ~[na:1.7.0_67]
at java.lang.SecurityManager.checkPermission(SecurityManager.java:549) ~[na:1.7.0_67]
at java.lang.SecurityManager.checkAccept(SecurityManager.java:1170) ~[na:1.7.0_67]
at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:261) ~[na:1.7.0_67]
at org.jboss.netty.channel.socket.nio.NioServerBoss.process(NioServerBoss.java:100) ~[netty-3.6.6.Final.jar:na]
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312) ~[netty-3.6.6.Final.jar:na]
at org.jboss.netty.channel.socket.nio.NioServerBoss.run(NioServerBoss.java:42) ~[netty-3.6.6.Final.jar:na]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_67]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_67]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_67]
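For reference, the permission being denied would correspond to a grant roughly like the following in catalina.policy (a sketch only; the codeBase path is a placeholder for my web application, and the worker's port changes on every connection, which is exactly the problem):

    grant codeBase "file:${catalina.base}/webapps/driver-app/-" {
        // The worker's ephemeral port (44904 above) varies per connection, so a
        // fixed port is useless here; a range such as 1024-65535 would work but
        // is far broader than I would like.
        permission java.net.SocketPermission "<worker IP address>:1024-65535", "accept,resolve";
    };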
spark.port.maxRetries. As per the docs: "Maximum number of retries when binding to a port before giving up. When a port is given a specific value (non 0), each subsequent retry will increment the port used in the previous attempt by 1 before retrying."
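Combined with the port properties from the security guide, this makes the ports the driver listens on predictable, so they can be whitelisted. A minimal sketch of setting them in the driver; the master URL, app name, and port values are illustrative placeholders, not recommendations:

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;

    public class DriverPortConfig {
        public static void main(String[] args) {
            // Pin the driver-side server ports so a security policy can reference them.
            SparkConf conf = new SparkConf()
                    .setMaster("spark://master-host:7077")    // placeholder master URL
                    .setAppName("tomcat-driver")              // placeholder app name
                    .set("spark.driver.port", "40000")        // port executors connect back to
                    .set("spark.blockManager.port", "40010")  // block manager port
                    .set("spark.port.maxRetries", "16");      // if 40000 is taken, try 40001..40016
            JavaSparkContext sc = new JavaSparkContext(conf);
            sc.stop();
        }
    }

Note that, as the question observes, these pin only the listening (server) ports; the ephemeral client-side ports remain OS-chosen.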
You can optionally configure the cluster further by setting environment variables in conf/spark-env.sh. Create this file by starting with conf/spark-env.sh.template, and copy it to all your worker machines for the settings to take effect.
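The listening ports of the standalone daemons themselves can be pinned there as well; a sketch with illustrative values (worker ports are random by default):

    # conf/spark-env.sh -- example values only
    SPARK_MASTER_PORT=7077         # master RPC port (7077 is the default)
    SPARK_MASTER_WEBUI_PORT=8080   # master web UI (default)
    SPARK_WORKER_PORT=40020        # worker RPC port (random by default)
    SPARK_WORKER_WEBUI_PORT=8081   # worker web UI (default)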
An executor container (a single JVM) allocates a memory area that consists of three sections: heap memory, off-heap memory, and overhead memory. Off-heap memory is disabled by default via the property spark.memory.offHeap.enabled.
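Should you want it, a sketch of turning off-heap memory on, continuing the SparkConf above ("2g" is an illustrative size):

    // Off-heap is opt-in; the size must be positive when the flag is enabled.
    conf.set("spark.memory.offHeap.enabled", "true")
        .set("spark.memory.offHeap.size", "2g");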
A client port is typically determined dynamically, at runtime.
The server port is the port that the initial client request connects to. As that initial request is handled, the connection is completed, which (among other things) opens a "client" port on the requesting machine to receive the reply. Typically this client port is embedded in the initial request and is pulled from a range configured in the client's operating system (or at least in the TCP layer of the client's network stack).
If one could configure a client to offer only one port, it would probably introduce issues: when you run two instances of the client program, the second instance would not be able to open its port to receive responses from the server, and the first client would get the responses for both clients' requests.
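You can observe this dynamic assignment directly; a small self-contained sketch (the target host/port are placeholders):

    import java.net.Socket;

    public class EphemeralPortDemo {
        public static void main(String[] args) throws Exception {
            // Only the remote (server) port is fixed; the OS picks the local one.
            try (Socket s = new Socket("example.com", 80)) {
                System.out.println("remote (server) port: " + s.getPort());      // always 80
                System.out.println("local (client) port:  " + s.getLocalPort()); // varies per run
            }
        }
    }

Run it twice and the local port will almost certainly differ, which is the behavior that makes a fixed per-client port impractical.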
As you are seeing your server fail to accept a connection back on a client (response) port, you likely need to check your network and security configuration. Odds are you have a garden-variety networking issue, but it could be a firewall issue (or an overzealous virus scanner / firewalling solution).