apache spark: local[K] master URL - job gets stuck

I am using Apache Spark 0.8.0 to process a large data file, performing some basic .map and .reduceByKey operations on the RDD.

Since I am using a single machine with multiple processors, I specify local[8] as the master URL when creating the SparkContext:

val sc = new SparkContext("local[8]", "Tower-Aggs", SPARK_HOME ) 

But whenever I specify multiple processors, the job gets stuck (pauses/halts) at random. There is no definite place where it gets stuck; it's just random. Sometimes it won't happen at all. I am not sure whether it would eventually continue, but it stays stuck for a long time, after which I abort the job.

But when I use plain local in place of local[8], the job runs seamlessly without ever getting stuck:

val sc = new SparkContext("local", "Tower-Aggs", SPARK_HOME )

I am not able to understand where the problem is.

I am using Scala 2.9.3 and sbt to build and run the application

asked Nov 25 '13 by Vijay
1 Answer

I'm using Spark 1.0.0 and met the same problem: if a function passed to a transformation or action waits or loops indefinitely, Spark won't wake it, terminate it, or retry it by default; in that case you have to kill the task yourself.

However, a recent feature (speculative execution) allows Spark to launch duplicate copies of tasks that take much longer than the average running time of their peers. It can be enabled and tuned through the following configuration properties:

  • spark.speculation (default: false) — If set to "true", performs speculative execution of tasks: if one or more tasks in a stage are running slowly, they will be re-launched.

  • spark.speculation.interval (default: 100) — How often Spark checks for tasks to speculate, in milliseconds.

  • spark.speculation.quantile (default: 0.75) — Fraction of tasks that must be complete before speculation is enabled for a particular stage.

  • spark.speculation.multiplier (default: 1.5) — How many times slower than the median a task must be to be considered for speculation.

(source: http://spark.apache.org/docs/latest/configuration.html)
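As a minimal sketch of how these properties could be set, assuming the SparkConf API (introduced in Spark 0.9; on 0.8.x the equivalent would be JVM system properties such as System.setProperty("spark.speculation", "true") before creating the context):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical configuration sketch: app name and values are illustrative.
val conf = new SparkConf()
  .setMaster("local[8]")
  .setAppName("Tower-Aggs")
  .set("spark.speculation", "true")           // enable speculative execution
  .set("spark.speculation.interval", "100")   // check for slow tasks every 100 ms
  .set("spark.speculation.quantile", "0.75")  // wait for 75% of tasks to finish first
  .set("spark.speculation.multiplier", "1.5") // re-launch tasks 1.5x slower than median

val sc = new SparkContext(conf)
```

Note that speculation only re-launches slow tasks; if the function itself blocks forever, the duplicate copy may block the same way, so it is a mitigation rather than a guaranteed fix.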

answered Oct 01 '22 by tribbloid