The Spark Web UI has two DAG visualizations, one for the Job: <img src="https://i.stack.imgur.com/1Xo41.png" alt="enter image description here"> and one for the Stage: <img src="https://i.stack.imgur.com/IWNvE.png" alt="enter image description here"> as explained here. The blog post explains the green dots in the Job's DAG, but it says nothing about the green-shaded boxes in the Stage's DAG. Could someone please give a hint? Update: If the green shading also means that the indicated code is where data is cached, what can we do to improve performance?

The link you provided mentions this: <blockquote> <blockquote> Second, one of the RDDs is cached in the first stage (denoted by the green highlight) </blockquote> </blockquote> So the green boxes mark RDDs that are cached, and future references to those RDDs won't have to regenerate them from scratch.

What do green-shaded boxes in Spark DAG Visualization mean?


answered Sep 22 '22 14:09

Ramesh Maharjan
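To make the "won't have to be generated from scratch" part concrete, here is a minimal sketch in plain Python. It is a toy model, not Spark's API: `LazyDataset`, `compute_count` and `run_two_jobs` are hypothetical names invented for illustration. The only thing it borrows from Spark is the idea that a dataset is a recipe (lineage) that is re-run on every action unless it has been marked with `cache()`.

```python
class LazyDataset:
    """Toy stand-in for an RDD (hypothetical, not a Spark class).

    It stores a recipe for producing data rather than the data itself.
    """

    def __init__(self, compute, name):
        self._compute = compute   # function that rebuilds the data from scratch
        self.name = name
        self._use_cache = False
        self._cached = None
        self.compute_count = 0    # how many times the recipe actually ran

    def map(self, fn, name):
        # A transformation only extends the lineage; nothing runs yet.
        return LazyDataset(lambda: [fn(x) for x in self.collect()], name)

    def cache(self):
        # Only marks the dataset; the data is stored by the first action.
        self._use_cache = True
        return self

    def collect(self):
        # An action: reuse the stored result if there is one, else recompute.
        if self._use_cache and self._cached is not None:
            return self._cached
        self.compute_count += 1
        data = self._compute()
        if self._use_cache:
            self._cached = data
        return data


def run_two_jobs(cache_parsed):
    """Run two 'jobs' that both depend on the same intermediate dataset."""
    source = LazyDataset(lambda: list(range(5)), "source")
    parsed = source.map(lambda x: x * 10, "parsed")
    if cache_parsed:
        parsed.cache()  # this is the dataset that would be shaded green
    total = sum(parsed.map(lambda x: x + 1, "plus_one").collect())
    biggest = max(parsed.map(lambda x: x * 2, "doubled").collect())
    return total, biggest, parsed.compute_count


if __name__ == "__main__":
    print(run_two_jobs(cache_parsed=False))  # (105, 80, 2): parsed rebuilt per job
    print(run_two_jobs(cache_parsed=True))   # (105, 80, 1): parsed built once
```

As for the update about performance: in this model caching pays off only because `parsed` is used by two jobs. The same reasoning should carry over to Spark, where the real calls are `cache()` or `persist()` on the RDD and `unpersist()` once it is no longer needed. An RDD that is read only once gains nothing from being cached and just occupies memory.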
