
Spark hangs on union with zero running tasks

I have two RDDs of type RDD[T].

For example:
val a: RDD[Integer] = ....
val b: RDD[Integer] = ...
When I run
val z = a.union(b)
println(z)

Spark hangs forever:

[Stage 23:=============================> (1 + 0) / 2]

Not sure why it shows 0 running tasks.

Environment:

Spark 1.6

Scala 2.11.6

a and b contain 10 records each. The input file is small.

Has anyone come across a case where the number of running tasks is zero and Spark hangs and never finishes?

asked Sep 01 '25 05:09 by happybayes


1 Answer

Apparently setting

--conf spark.driver.host=127.0.0.1

solved the problem for me.
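For anyone who builds the SparkContext in code rather than passing flags on the command line, here is a minimal sketch of the same setting applied through SparkConf. The app name, the local[2] master, and the sample data are placeholders I made up for illustration; they are not from the question. I am also assuming a single-machine run, since a loopback address would not be reachable from executors on other hosts.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object UnionWithDriverHost {
  def main(args: Array[String]): Unit = {
    // Assumption: local run; app name and master are illustrative placeholders.
    val conf = new SparkConf()
      .setAppName("union-driver-host-sketch")
      .setMaster("local[2]")
      // Same setting as --conf spark.driver.host=127.0.0.1 on the command line.
      .set("spark.driver.host", "127.0.0.1")

    val sc = new SparkContext(conf)
    try {
      // Two small RDDs standing in for a and b from the question.
      val a = sc.parallelize(1 to 10)
      val b = sc.parallelize(11 to 20)

      val z = a.union(b)

      // collect() is an action, so this is where the union job actually runs.
      println(z.collect().mkString(", "))
    } finally {
      sc.stop()
    }
  }
}
```

The property has to be set before the SparkContext is created; changing it on an already running context should not be expected to take effect.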

Thanks to Melitta Dragaschnig and rrusso2007.

Edit: I was facing this problem when doing a union between two DataFrames, one of which was created by reading from Cassandra (using the DataStax Spark Cassandra Connector).

answered Sep 02 '25 22:09 by Alexandru T.