I'm trying to learn Spark by running it in standalone mode on my MacBook Pro (OS X 10.9.2). I downloaded and built it using the instructions here:
http://spark.apache.org/docs/latest/building-spark.html
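(For reference, the build command I ran, per that page, was essentially the following; skipping tests was my own choice, and you may also want Hadoop profile flags for your setup:)
# run from the top of the Spark source tree
mvn -DskipTests clean package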
I then started up the master server using the instructions here:
https://spark.apache.org/docs/latest/spark-standalone.html#starting-a-cluster-manually
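(Specifically, by running this from SPARK_HOME:)
./sbin/start-master.sh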
That worked, although I had to add the following line to my SPARK_HOME/conf/spark-env.sh file to get it to start successfully:
export SPARK_MASTER_IP="127.0.0.1"
However, when I try to execute this command to start a worker:
./bin/spark-class org.apache.spark.deploy.worker.Worker spark://127.0.0.1:7077
It fails with this error:
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
15/01/26 16:29:17 INFO Worker: Registered signal handlers for [TERM, HUP, INT]
15/01/26 16:29:17 INFO SecurityManager: Changing view acls to: erioconnor
15/01/26 16:29:17 INFO SecurityManager: Changing modify acls to: erioconnor
15/01/26 16:29:17 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(erioconnor); users with modify permissions: Set(erioconnor)
15/01/26 16:29:17 INFO Slf4jLogger: Slf4jLogger started
15/01/26 16:29:17 INFO Remoting: Starting remoting
15/01/26 16:29:17 ERROR NettyTransport: failed to bind to /10.252.181.130:0, shutting down Netty transport
15/01/26 16:29:17 ERROR Remoting: Remoting error: [Startup failed] [
akka.remote.RemoteTransportException: Startup failed
at akka.remote.Remoting.akka$remote$Remoting$$notifyError(Remoting.scala:136)
at akka.remote.Remoting.start(Remoting.scala:201)
at akka.remote.RemoteActorRefProvider.init(RemoteActorRefProvider.scala:184)
at akka.actor.ActorSystemImpl.liftedTree2$1(ActorSystem.scala:618)
at akka.actor.ActorSystemImpl._start$lzycompute(ActorSystem.scala:615)
at akka.actor.ActorSystemImpl._start(ActorSystem.scala:615)
at akka.actor.ActorSystemImpl.start(ActorSystem.scala:632)
at akka.actor.ActorSystem$.apply(ActorSystem.scala:141)
at akka.actor.ActorSystem$.apply(ActorSystem.scala:118)
at org.apache.spark.util.AkkaUtils$.org$apache$spark$util$AkkaUtils$$doCreateActorSystem(AkkaUtils.scala:121)
at org.apache.spark.util.AkkaUtils$$anonfun$1.apply(AkkaUtils.scala:54)
at org.apache.spark.util.AkkaUtils$$anonfun$1.apply(AkkaUtils.scala:53)
at org.apache.spark.util.Utils$$anonfun$startServiceOnPort$1.apply$mcVI$sp(Utils.scala:1676)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
at org.apache.spark.util.Utils$.startServiceOnPort(Utils.scala:1667)
at org.apache.spark.util.AkkaUtils$.createActorSystem(AkkaUtils.scala:56)
at org.apache.spark.deploy.worker.Worker$.startSystemAndActor(Worker.scala:495)
at org.apache.spark.deploy.worker.Worker$.main(Worker.scala:475)
at org.apache.spark.deploy.worker.Worker.main(Worker.scala)
Caused by: org.jboss.netty.channel.ChannelException: Failed to bind to: /10.252.181.130:0
at org.jboss.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:272)
at akka.remote.transport.netty.NettyTransport$$anonfun$listen$1.apply(NettyTransport.scala:393)
at akka.remote.transport.netty.NettyTransport$$anonfun$listen$1.apply(NettyTransport.scala:389)
at scala.util.Success$$anonfun$map$1.apply(Try.scala:206)
at scala.util.Try$.apply(Try.scala:161)
at scala.util.Success.map(Try.scala:206)
at scala.concurrent.Future$$anonfun$map$1.apply(Future.scala:235)
at scala.concurrent.Future$$anonfun$map$1.apply(Future.scala:235)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)
at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.processBatch$1(BatchingExecutor.scala:67)
at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:82)
at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply(BatchingExecutor.scala:59)
at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply(BatchingExecutor.scala:59)
at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)
at akka.dispatch.BatchingExecutor$Batch.run(BatchingExecutor.scala:58)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.net.BindException: Can't assign requested address
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:444)
at sun.nio.ch.Net.bind(Net.java:436)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
at org.jboss.netty.channel.socket.nio.NioServerBoss$RegisterTask.run(NioServerBoss.java:193)
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.processTaskQueue(AbstractNioSelector.java:372)
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:296)
at org.jboss.netty.channel.socket.nio.NioServerBoss.run(NioServerBoss.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
]
15/01/26 16:29:17 WARN Utils: Service 'sparkWorker' could not bind on port 0. Attempting port 1.
15/01/26 16:29:17 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
15/01/26 16:29:17 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
15/01/26 16:29:17 INFO Remoting: Remoting shut down
15/01/26 16:29:17 INFO RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
The error repeats a grand total of 16 times before the process gives up.
I don't understand the reference to IP address 10.252.181.130. It doesn't appear anywhere in the Spark code/config that I've been able to locate, and Googling it doesn't turn up any results. I noticed this line in the log file from when I started the master server:
15/01/26 16:27:18 INFO MasterWebUI: Started MasterWebUI at http://10.252.181.130:8080
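(Side note: to see which of my network interfaces owns that address, I ran the command below. My guess is that 10.252.181.130 belongs to a VPN or corporate-network interface, but that's speculation:)
ifconfig | grep "inet "    # lists the IPv4 address bound to each active interface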
I looked at the Scala source for MasterWebUI (and WebUI, which it extends), and noticed that they seem to get that IP address from an environment variable called SPARK_PUBLIC_DNS. I tried setting that variable to 127.0.0.1 in my spark-env.sh script, but that didn't work. In fact, it prevented my master server from starting:
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
15/01/26 19:46:16 INFO Master: Registered signal handlers for [TERM, HUP, INT]
Exception in thread "main" java.net.UnknownHostException: LM-PDX-00871419: LM-PDX-00871419: nodename nor servname provided, or not known
at java.net.InetAddress.getLocalHost(InetAddress.java:1473)
at org.apache.spark.util.Utils$.findLocalIpAddress(Utils.scala:620)
at org.apache.spark.util.Utils$.localIpAddress$lzycompute(Utils.scala:612)
at org.apache.spark.util.Utils$.localIpAddress(Utils.scala:612)
at org.apache.spark.util.Utils$.localIpAddressHostname$lzycompute(Utils.scala:613)
at org.apache.spark.util.Utils$.localIpAddressHostname(Utils.scala:613)
at org.apache.spark.util.Utils$$anonfun$localHostName$1.apply(Utils.scala:665)
at org.apache.spark.util.Utils$$anonfun$localHostName$1.apply(Utils.scala:665)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.util.Utils$.localHostName(Utils.scala:665)
at org.apache.spark.deploy.master.MasterArguments.<init>(MasterArguments.scala:27)
at org.apache.spark.deploy.master.Master$.main(Master.scala:819)
at org.apache.spark.deploy.master.Master.main(Master.scala)
Caused by: java.net.UnknownHostException: LM-PDX-00871419: nodename nor servname provided, or not known
at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:901)
at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1293)
at java.net.InetAddress.getLocalHost(InetAddress.java:1469)
... 12 more
(Note that LM-PDX-00871419 is the hostname of my Mac.)
So at this point I'm kinda stumped. I'd greatly appreciate any suggestions as to where to look next.
If you use IntelliJ IDEA, it might be helpful to know that one can set the environment variable SPARK_LOCAL_IP to 127.0.0.1 in Run/Debug configurations.
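(Outside of IDEA, the same setting can presumably go in conf/spark-env.sh, or be exported in the shell before launching the daemons, e.g.:)
export SPARK_LOCAL_IP=127.0.0.1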
I've had trouble using the loopback address, 127.0.0.1, or localhost. I've had better success with the actual public IP address of the machine, e.g., 192.168.1.101. I don't know where the 10.252.181.130 is coming from, either.
For your Mac's host name, try adding ".local" to the end. If that doesn't work, add an entry to /etc/hosts.
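For example, using the hostname from the question (whether your Mac actually resolves the .local alias is system-dependent, so treat this as a sketch):
127.0.0.1    localhost LM-PDX-00871419 LM-PDX-00871419.local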
Finally, I recommend using the sbin/start-master.sh and sbin/start-slave.sh scripts for running these services.
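For example (exact arguments vary by Spark version; some releases expect a worker number before the master URL in start-slave.sh):
./sbin/start-master.sh
./sbin/start-slave.sh spark://127.0.0.1:7077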