
Why does a Spark application fail with "executor.CoarseGrainedExecutorBackend: Driver Disassociated"?

When I execute a SQL query via spark-submit and spark-sql, the corresponding Spark application always fails with the following error:

15/03/10 18:50:52 INFO util.AkkaUtils: Connecting to HeartbeatReceiver: akka.tcp://sparkDriver@slave75:60697/user/HeartbeatReceiver
15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave79:35643] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.

The above is just one of the errors. I used "yarn logs -application application_1425944520319_8102.log" to obtain the whole application log and filtered out the errors, shown below:

Line 46: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave09:55156] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 97: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave09:32852] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 149: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave09:45654] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 200: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave10:45702] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 251: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave10:21596] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 302: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave10:58845] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 353: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave13:1697] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 437: 15/03/10 18:52:06 WARN hdfs.DFSClient: error creating legacy BlockReaderLocal.  Disabling legacy local reads.
Line 481: 15/03/10 18:52:06 ERROR executor.Executor: Exception in task 3.0 in stage 0.0 (TID 10)
Line 504: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave13:6289] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 556: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave14:37070] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 607: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave14:43424] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 658: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave15:38083] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 710: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave15:3106] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 761: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave15:35533] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 812: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave16:63207] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 863: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave16:11250] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 910: 15/03/10 18:52:09 ERROR executor.CoarseGrainedExecutorBackend: RECEIVED SIGNAL 15: SIGTERM
Line 961: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave18:26917] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 1012: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave18:3058] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 1063: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave19:1885] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 1114: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave19:14795] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 1165: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave19:39794] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 1216: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave20:19614] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 1267: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave20:38776] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 1318: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave21:19231] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 1370: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave21:18816] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 1454: 15/03/10 18:52:06 WARN hdfs.DFSClient: error creating legacy BlockReaderLocal.  Disabling legacy local reads.
Line 1498: 15/03/10 18:52:06 ERROR executor.Executor: Exception in task 1.0 in stage 0.0 (TID 18)
Line 1524: 15/03/10 18:52:06 ERROR executor.Executor: Exception in task 1.1 in stage 0.0 (TID 28)
Line 1550: 15/03/10 18:52:06 ERROR executor.Executor: Exception in task 1.2 in stage 0.0 (TID 31)
Line 1576: 15/03/10 18:52:06 ERROR executor.Executor: Exception in task 1.3 in stage 0.0 (TID 32)
Line 1602: 15/03/10 18:52:06 ERROR executor.Executor: Exception in task 1.4 in stage 0.0 (TID 33)
Line 1628: 15/03/10 18:52:07 ERROR executor.Executor: Exception in task 1.5 in stage 0.0 (TID 36)
Line 1654: 15/03/10 18:52:07 ERROR executor.Executor: Exception in task 1.6 in stage 0.0 (TID 37)
Line 1680: 15/03/10 18:52:07 ERROR executor.Executor: Exception in task 1.7 in stage 0.0 (TID 39)
Line 1706: 15/03/10 18:52:07 ERROR executor.Executor: Exception in task 1.8 in stage 0.0 (TID 41)
Line 1732: 15/03/10 18:52:07 ERROR executor.Executor: Exception in task 1.9 in stage 0.0 (TID 42)
Line 1755: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave22:24322] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 1806: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave23:38508] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 1858: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave24:19707] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 1909: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave25:33683] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 1976: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave25:18587] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 2027: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave26:64531] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 2078: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave27:23333] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 2129: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave27:61136] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 2180: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave27:25118] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 2231: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave28:16274] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 2282: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave29:1324] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 2334: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave29:51664] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 2385: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave29:38854] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 2452: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave30:30088] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 2504: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave30:30778] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 2556: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave31:52263] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 2623: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave31:17806] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 2674: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave32:3251] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 2725: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave32:17832] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 2776: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave32:11629] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 2827: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave33:22629] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 2911: 15/03/10 18:52:07 WARN hdfs.DFSClient: error creating legacy BlockReaderLocal.  Disabling legacy local reads.
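
A filtered listing like the one above is easier to read once it is grouped by logger and ordered by first occurrence. The following is a minimal sketch, assuming the log4j line layout seen in this excerpt (with an optional "Line N:" prefix); the names LOG_LINE, summarize and sample are hypothetical helpers, not part of Spark or YARN. On this excerpt, the executor.Executor task exceptions first appear at 18:52:06, two seconds before the "Driver Disassociated" messages at 18:52:08, which suggests the disassociation may be a downstream symptom rather than the root cause.

```python
import re
from collections import Counter

# Assumed layout: optional "Line N: " prefix, then "yy/MM/dd HH:mm:ss LEVEL logger: message".
LOG_LINE = re.compile(
    r"(?:Line \d+: )?"
    r"(?P<ts>\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) "
    r"(?P<source>[\w.]+): "
    r"(?P<message>.*)"
)


def summarize(lines):
    """Count log lines per (level, logger) and record each group's earliest timestamp."""
    counts = Counter()
    first_seen = {}
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if m is None:
            continue  # skip lines that do not follow the assumed layout
        key = (m["level"], m["source"])
        counts[key] += 1
        ts = m["ts"]
        # Fixed-width yy/MM/dd HH:mm:ss timestamps sort correctly as plain strings.
        if key not in first_seen or ts < first_seen[key]:
            first_seen[key] = ts
    return counts, first_seen


# Three lines copied from the excerpt above, used as sample input.
sample = [
    "Line 46: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave09:55156] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.",
    "Line 437: 15/03/10 18:52:06 WARN hdfs.DFSClient: error creating legacy BlockReaderLocal.  Disabling legacy local reads.",
    "Line 481: 15/03/10 18:52:06 ERROR executor.Executor: Exception in task 3.0 in stage 0.0 (TID 10)",
]

counts, first_seen = summarize(sample)
# Print groups in order of first appearance to see which error came first.
for key in sorted(first_seen, key=first_seen.get):
    print(first_seen[key], key[0], key[1], counts[key])
```

To run it on the full log, pass an open file object to summarize instead of the sample list.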

You can get the log file from https://www.dropbox.com/s/lf50ger18v3ngtb/application_1425944520319_8102.log?dl=0 if I haven't expressed myself clearly.

The network on slave75 is OK, and hosts on all nodes are correctly configured. Any response will help, thanks!

asked Mar 10 '15 15:03 by zwb


1 Answer

Finally I found the reason: YARN kills the executor (its container) because the executor exceeds its memory allowance. Just increase the value of spark.yarn.driver.memoryOverhead, spark.yarn.executor.memoryOverhead, or both.
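
To illustrate why raising the overhead helps: on YARN, each executor container is requested with the executor heap plus the memory overhead, so a larger overhead gives the process more room before the container is killed. The sketch below is a rough illustration under stated assumptions, not Spark's own code. The function names are hypothetical, and the default fraction (0.10) and minimum (384 MB) are assumptions that vary between Spark versions, so check the documentation for the version in use. The overhead values are in megabytes.

```python
def container_request_mb(executor_memory_mb, overhead_mb=None,
                         overhead_fraction=0.10, min_overhead_mb=384):
    """Estimate the memory requested from YARN for one executor container.

    overhead_fraction and min_overhead_mb are assumed defaults; they differ
    between Spark versions.
    """
    if overhead_mb is None:
        # No explicit overhead set: fall back to the assumed default rule.
        overhead_mb = max(min_overhead_mb, int(executor_memory_mb * overhead_fraction))
    return executor_memory_mb + overhead_mb


def overhead_conf_args(executor_overhead_mb=None, driver_overhead_mb=None):
    """Build spark-submit --conf arguments for the two settings named in the answer."""
    args = []
    if executor_overhead_mb is not None:
        args += ["--conf", f"spark.yarn.executor.memoryOverhead={executor_overhead_mb}"]
    if driver_overhead_mb is not None:
        args += ["--conf", f"spark.yarn.driver.memoryOverhead={driver_overhead_mb}"]
    return args


# Example: a 4096 MB executor with the overhead raised to 1024 MB.
print(container_request_mb(4096, overhead_mb=1024))
print(" ".join(["spark-submit"] + overhead_conf_args(executor_overhead_mb=1024)))
```

The resulting arguments would be placed before the application jar or script on the spark-submit command line. The total per container must still fit within the maximum container size the YARN cluster allows.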

answered Oct 25 '22 00:10 by zwb