In the following snippet of the application UI, what do the blue blocks in each stage represent?
What do "Exchange", "WholeStageCodegen", etc. mean?
Where can I find a resource for interpreting what Spark is doing here?
Many thanks
When you click on a job on the summary page, you see the details page for that job, which shows the event timeline, the DAG visualization, and all stages of the job.
Skipped stages are shown in grey: their output is already available (for example, cached in memory), so it is not recomputed from the source data such as HDFS. A glance at the DAG visualization is enough to tell whether RDD computations are being repeated or cached stages are being reused.
Each blue box represents one step (operator) of the Apache Spark job.
You also asked about WholeStageCodegen:
Whole-Stage Code Generation (aka WholeStageCodegen or WholeStageCodegenExec) fuses multiple operators (as a subtree of plans that support codegen) together into a single Java function that is aimed at improving execution performance. It collapses a query into a single optimized function that eliminates virtual function calls and leverages CPU registers for intermediate data.
You can find more details in SPARK-12795.
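To make the fusion idea concrete, here is a toy sketch in plain Python. It is not Spark's generated code (Spark emits Java), and all function names here are made up for illustration. It only contrasts chaining one iterator per operator with a single hand-fused loop that keeps the intermediate value in a local variable.

```python
from typing import Callable, Iterable, Iterator


# Operator-at-a-time style: every operator is a separate iterator, so each
# row passes through several function/generator boundaries.
def scan(rows: Iterable[int]) -> Iterator[int]:
    for row in rows:
        yield row


def filter_op(child: Iterator[int], pred: Callable[[int], bool]) -> Iterator[int]:
    for row in child:
        if pred(row):
            yield row


def project_op(child: Iterator[int], fn: Callable[[int], int]) -> Iterator[int]:
    for row in child:
        yield fn(row)


def sum_op(child: Iterator[int]) -> int:
    total = 0
    for row in child:
        total += row
    return total


def run_unfused(rows: Iterable[int]) -> int:
    # scan -> filter (even values) -> project (times 10) -> sum
    evens = filter_op(scan(rows), lambda x: x % 2 == 0)
    scaled = project_op(evens, lambda x: x * 10)
    return sum_op(scaled)


def run_fused(rows: Iterable[int]) -> int:
    # The same query collapsed into one loop: no per-operator calls,
    # and the running total lives in a single local variable.
    total = 0
    for x in rows:
        if x % 2 == 0:
            total += x * 10
    return total


if __name__ == "__main__":
    data = list(range(10))
    print(run_unfused(data), run_fused(data))
```

Both functions compute the same result. The fused version is the shape that whole-stage code generation aims for: one function per group of operators instead of one call per operator per row.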
Exchange refers to a shuffle exchange, where data is redistributed across partitions between stages. In more detail:
ShuffleExchange is a unary physical operator. It corresponds to Repartition (with shuffle enabled) and RepartitionByExpression logical operators (as translated in BasicOperators strategy).
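As a rough illustration of what a shuffle does, the sketch below redistributes rows across partitions by hashing their key, so that all rows with the same key end up in the same output partition. This is a conceptual model only: the function names are hypothetical, and zlib.crc32 is just a stand-in for a stable hash, not the hash function Spark uses.

```python
import zlib
from collections import defaultdict
from typing import Dict, List, Tuple

Row = Tuple[str, int]  # (key, value)


def shuffle_by_key(partitions: List[List[Row]], num_out: int) -> List[List[Row]]:
    """Move every row to the output partition chosen by hashing its key."""
    out: List[List[Row]] = [[] for _ in range(num_out)]
    for part in partitions:
        for key, value in part:
            # crc32 is an assumption made for determinism in this sketch.
            target = zlib.crc32(key.encode("utf-8")) % num_out
            out[target].append((key, value))
    return out


def sum_per_key(partition: List[Row]) -> Dict[str, int]:
    """Aggregate within one partition; safe once keys are co-located."""
    totals: Dict[str, int] = defaultdict(int)
    for key, value in partition:
        totals[key] += value
    return dict(totals)


if __name__ == "__main__":
    before = [[("a", 1), ("b", 2)], [("a", 3), ("c", 4)], [("b", 5)]]
    after = shuffle_by_key(before, num_out=2)
    for i, part in enumerate(after):
        print(i, sum_per_key(part))
```

After the shuffle, each partition can be aggregated independently, which is why an Exchange typically shows up in the plan between the two halves of a grouped aggregation or before a join.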
You can get all of this information from your code by calling explain() on your DataFrame, for example df.explain().
Each step shows what your DataFrame is going to do, which is a good way to check whether your logic is right. If you want more details about the Spark UI, I suggest watching this Spark Summit presentation and reading this article about execution planning.
This information will tell you much more about what you are asking.