I'm running a distributed Spark application in YARN client mode on a Cloudera cluster. After some time, errors start to appear in Cloudera Manager: some executors get disconnected, and this happens systematically. I would like to debug the issue, but YARN does not report the underlying exception.
Exception from container-launch with container ID: container_1417503665765_0193_01_000003 and exit code: 1
ExitCodeException exitCode=1:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
    at org.apache.hadoop.util.Shell.run(Shell.java:455)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:196)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:299)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
How can I see the stack trace of the exception? YARN seems to report only that the application exited abnormally. Is there a way to see the Spark executor logs when running on YARN?
Standalone mode: Spark executor logs are located in the $SPARK_HOME/work/app-<AppName> directory (where <AppName> is the name of your application). The location also contains stdout/stderr from H2O.
There are two deploy modes that can be used to launch Spark applications on YARN. In cluster mode, the Spark driver runs inside an application master process which is managed by YARN on the cluster, and the client can go away after initiating the application.
When running Spark on YARN, each Spark executor runs as a YARN container. Where MapReduce schedules a container and starts a JVM for each task, Spark hosts multiple tasks within the same container. This makes task startup several orders of magnitude faster.
Check the NodeManager's yarn.nodemanager.log-dirs property. It lists the local directories where container logs, including those of the Spark executors, are written while the containers are running.
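To make the lookup concrete, here is a minimal Python sketch that reads the property from a yarn-site.xml document and builds the paths where an executor's stderr would normally sit on the NodeManager host. It assumes the property is spelled yarn.nodemanager.log-dirs as in stock Hadoop, and that logs follow the usual <log-dir>/<application ID>/<container ID>/stderr layout. The function names and the sample directory are hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET
from pathlib import PurePosixPath
from typing import List


def nodemanager_log_dirs(yarn_site_xml: str) -> List[str]:
    """Return the directories listed under yarn.nodemanager.log-dirs.

    The property value is a comma-separated list; an empty list is
    returned when the property is not set in the given XML.
    """
    root = ET.fromstring(yarn_site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == "yarn.nodemanager.log-dirs":
            value = prop.findtext("value", "")
            return [d.strip() for d in value.split(",") if d.strip()]
    return []


def container_log_paths(log_dirs: List[str],
                        application_id: str,
                        container_id: str) -> List[str]:
    """Build candidate stderr paths for one container.

    Assumes the usual <log-dir>/<application ID>/<container ID>/stderr layout.
    """
    return [
        str(PurePosixPath(d) / application_id / container_id / "stderr")
        for d in log_dirs
    ]


if __name__ == "__main__":
    # Hypothetical yarn-site.xml content; the directory is only an example.
    sample = """
    <configuration>
      <property>
        <name>yarn.nodemanager.log-dirs</name>
        <value>/var/log/hadoop-yarn/container</value>
      </property>
    </configuration>
    """
    dirs = nodemanager_log_dirs(sample)
    for path in container_log_paths(
        dirs,
        "application_1417503665765_0193",
        "container_1417503665765_0193_01_000003",
    ):
        print(path)
```

The printed paths are only candidates: they have to be checked on the node that actually ran the container, and the files may already be gone once the application has finished.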
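Note that when the application finishes, the NodeManager may remove these local files if log aggregation is enabled. In that case the logs are collected on HDFS and can be retrieved with yarn logs -applicationId <application ID>. See this post for details: http://hortonworks.com/blog/simplifying-user-logs-management-and-access-in-yarn/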
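When log aggregation is enabled, the aggregated logs are fetched by application ID rather than by container ID. As a rough sketch, the application ID can be derived from the container ID shown in the error above and turned into a yarn logs command. The helper names are hypothetical, and the regular expression assumes the container_<cluster timestamp>_<application number>_<attempt>_<container number> layout, optionally with the epoch prefix used by newer Hadoop releases.

```python
import re
from typing import List

# container_<clusterTimestamp>_<appNumber>_<attempt>_<containerNumber>,
# optionally with an "e<epoch>_" prefix after "container_".
_CONTAINER_RE = re.compile(r"^container_(?:e\d+_)?(\d+)_(\d+)_\d+_\d+$")


def application_id_from_container(container_id: str) -> str:
    """Derive the YARN application ID that a container ID belongs to."""
    match = _CONTAINER_RE.match(container_id)
    if match is None:
        raise ValueError(f"unrecognized container ID: {container_id!r}")
    cluster_timestamp, app_number = match.groups()
    return f"application_{cluster_timestamp}_{app_number}"


def yarn_logs_command(container_id: str) -> List[str]:
    """Build the argument list for fetching the aggregated logs."""
    return ["yarn", "logs", "-applicationId",
            application_id_from_container(container_id)]


if __name__ == "__main__":
    cmd = yarn_logs_command("container_1417503665765_0193_01_000003")
    print(" ".join(cmd))
```

The command is returned as an argument list so it can be passed to subprocess.run on a machine with the YARN client installed. The output covers all containers of the application, so the failing executor's section still has to be located by its container ID.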