I have two clusters: one in a local virtual machine and another in the remote cloud. Both clusters run in Standalone mode.
My Environment:
Scala: 2.10.4
Spark: 1.5.1
JDK: 1.8.40
OS: CentOS Linux release 7.1.1503 (Core)
The local cluster:
Spark Master: spark://local1:7077
The remote cluster:
Spark Master: spark://remote1:7077
I want to finish this:
Write code (just a simple word count) in IntelliJ IDEA locally (on my laptop), set the Spark Master URL to spark://local1:7077 or spark://remote1:7077, and then run the code from IntelliJ IDEA. That is, I don't want to use spark-submit to submit a job. A minimal sketch of what I mean is below.
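For illustration, this is roughly the kind of program I run from the IDE; the input path and app name are placeholders, not part of my actual setup:

import org.apache.spark.{SparkConf, SparkContext}

object WordCount {
  def main(args: Array[String]): Unit = {
    // Master URL points at either the local VM cluster or the remote one.
    val conf = new SparkConf()
      .setAppName("WordCount")
      .setMaster("spark://local1:7077") // or "spark://remote1:7077"
    val sc = new SparkContext(conf)

    val counts = sc.textFile("hdfs:///tmp/input.txt") // placeholder input path
      .flatMap(_.split("\\s+"))
      .map(word => (word, 1))
      .reduceByKey(_ + _)

    counts.collect().foreach(println)
    sc.stop()
  }
}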
But I ran into a problem:
When I use the local cluster, everything goes well: running the code from IntelliJ IDEA or using spark-submit both submit the job to the cluster and finish it.
But when I use the remote cluster, I get a warning log:
TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
Note it says sufficient resources, not sufficient memory!
This log keeps printing with no further progress. Both spark-submit and running the code from IntelliJ IDEA give the same result.
I want to know why this happens and how to fix it.
Thanks a lot!
Update
There is a similar question here, but I think my situation is different. When I run my code in IntelliJ IDEA with the Spark Master set to the local virtual machine cluster, it works. With the remote cluster, I get the Initial job has not accepted any resources;... warning instead.
I want to know whether a security policy or a firewall could cause this.
Submitting code programmatically (e.g. via SparkSubmit) is quite tricky. At the very least, there is a variety of environment settings and considerations - handled by the spark-submit script - that are quite difficult to replicate within a Scala program. I am still uncertain how to achieve it, and there have been a number of long-running threads within the Spark developer community on the topic.
My answer here addresses only a portion of your post, specifically:
TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
The reason is typically a mismatch between the memory and/or number of cores requested by your job and what is available on the cluster. Possibly, when submitting from IntelliJ, the settings in $SPARK_HOME/conf/spark-defaults.conf do not match the parameters required for your task on the existing cluster. You may need to update:
spark.driver.memory 4g
spark.executor.memory 8g
spark.executor.cores 8
You can check the Spark UI on port 8080 to verify that the parameters you requested are actually available on the cluster.
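Note also that when you launch the driver from the IDE rather than through spark-submit, the cluster's spark-defaults.conf is not read by your driver, so the equivalent properties have to be set on the SparkConf yourself. A minimal sketch, with example values only; they must stay within what the remote workers actually advertise:

import org.apache.spark.{SparkConf, SparkContext}

// Example values - adjust to what your workers offer (visible in the UI on port 8080).
val conf = new SparkConf()
  .setAppName("WordCount")
  .setMaster("spark://remote1:7077")
  .set("spark.executor.memory", "2g") // per-executor memory, must fit on a worker
  .set("spark.executor.cores", "2")   // cores per executor, must fit on a worker
  .set("spark.cores.max", "4")        // cap on total cores requested from the cluster

// spark.driver.memory must be set before the driver JVM starts
// (e.g. via the IDE's VM options), so it is not set here.
val sc = new SparkContext(conf)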