I am a beginner in Apache Spark and came across the garbage collection time of tasks in the Spark web UI. Does the execution time of a task include the task's garbage collection time?
Yes. The GC time shown in the Spark UI is part of the task's total execution time. If GC is taking more time than the actual computation, take a closer look at what your job is doing.
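To judge whether GC is eating too much of a task, you can compare the two numbers the UI shows for it. Here is a minimal sketch in plain Python (no Spark needed). The function names, the tuple layout, and the 10% threshold are my own assumptions for illustration, not part of any Spark API:

```python
def gc_fraction(task_duration_ms, gc_time_ms):
    """Share of a task's duration that was spent in JVM garbage collection."""
    if task_duration_ms <= 0:
        return 0.0
    return gc_time_ms / task_duration_ms


def gc_heavy_tasks(tasks, threshold=0.10):
    """Return the IDs of tasks whose GC share exceeds `threshold`.

    `tasks` is an iterable of (task_id, duration_ms, gc_time_ms) tuples,
    e.g. copied by hand from the task table in the web UI.
    The 10% default is an assumed rule of thumb, adjust it to your job.
    """
    return [
        task_id
        for task_id, duration_ms, gc_ms in tasks
        if gc_fraction(duration_ms, gc_ms) > threshold
    ]


# Made-up numbers, for illustration only.
sample = [(0, 2000, 100), (1, 1500, 600), (2, 800, 0)]
print(gc_heavy_tasks(sample))  # task 1 spends 40% of its time in GC
```

Because GC time is already included in the duration, the fraction can never meaningfully exceed 1.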
If you are having problems with GC, there are plenty of ways to improve Spark's memory usage or tune the garbage collector.
According to the Databricks blog, GC time is a recurring problem for any large company whose jobs use gigabytes of memory:
For example, garbage collection takes a long time, causing program to experience long delays, or even crash in severe cases.
You can see the full text here.
You can also look at how to tune your Spark application to reduce GC time and avoid "GC overhead limit exceeded" or OOM errors during execution.
Please check this part of the documentation.
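As a rough starting point for that kind of tuning, the sketch below builds the `--conf` arguments you might pass to `spark-submit`. The helper name and the default values (such as `4g`) are hypothetical. The configuration keys and JVM flags themselves (`spark.executor.memory`, `spark.serializer`, `spark.executor.extraJavaOptions`, `-XX:+UseG1GC`, `-verbose:gc`) are standard ones, but whether they help depends on your workload:

```python
def spark_submit_gc_args(executor_memory="4g", use_g1=True, log_gc=True):
    """Build spark-submit --conf arguments for a GC-focused tuning experiment.

    Hypothetical helper: it only assembles strings and does not call Spark.
    """
    java_opts = []
    if use_g1:
        java_opts.append("-XX:+UseG1GC")  # switch executors to the G1 collector
    if log_gc:
        java_opts.append("-verbose:gc")   # print GC events in the executor logs

    conf = {
        "spark.executor.memory": executor_memory,
        # Kryo usually produces smaller serialized objects than Java serialization.
        "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
    }
    if java_opts:
        conf["spark.executor.extraJavaOptions"] = " ".join(java_opts)

    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


print(spark_submit_gc_args())
```

You would append the returned list to your `spark-submit` command, then compare the GC Time column in the web UI before and after the change.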