 

Spark - Shuffle Read Blocked Time

Lately I've been tuning the performance of some large, shuffle-heavy jobs. Looking at the Spark UI, I noticed an option called "Shuffle Read Blocked Time" under the additional metrics section.

This "Shuffle Read Blocked Time" seems to account for upwards of 50% of the task duration for a large swath of tasks.

While I can intuit some possibilities for what this means, I can't find any documentation that explains what it actually represents. Needless to say, I also haven't been able to find any resources on mitigation strategies.

Can anyone provide some insight into how I might reduce Shuffle Read Blocked Time?

asked May 26 '16 by dayman


People also ask

What happens when Spark performs a shuffle?

Spark shuffles the mapped data across partitions; sometimes it also spills the shuffled data to disk for reuse when it needs to be recomputed. Finally, it runs reduce tasks on each partition based on key.
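For context, here is a minimal sketch (local mode, toy data) of a wide transformation that triggers a shuffle: reduceByKey regroups records across partitions so all values for a key land together.

```scala
import org.apache.spark.sql.SparkSession

object ShuffleExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("shuffle-example")
      .master("local[*]")
      .getOrCreate()

    // reduceByKey is a wide transformation: Spark shuffles records
    // across partitions so that all values for a key end up on the
    // same partition, then runs the reduce function per key.
    val counts = spark.sparkContext
      .parallelize(Seq("a" -> 1, "b" -> 1, "a" -> 1))
      .reduceByKey(_ + _)

    counts.collect().foreach(println)
    spark.stop()
  }
}
```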

What is a shuffle block in Spark?

The chunk of shuffle data written by a shuffle map task for a given shuffle reduce task is called a shuffle block. Each shuffle map task then informs the driver about the shuffle data it has written.

What is task Deserialization time in Spark?

Task Deserialization Time: Spark by default uses the Java serializer for object serialization. To enable the Kryo serializer, which outperforms the default Java serializer in both time and space, set the spark.serializer parameter to org.apache.spark.serializer.KryoSerializer.
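A minimal sketch of enabling Kryo when building a session; the registrationRequired setting is optional and shown only for illustration:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("kryo-example")
  .master("local[*]")
  // Switch from the default Java serializer to Kryo, which is
  // generally faster and produces more compact output.
  .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  // Optional: require explicit class registration so Kryo can skip
  // writing full class names (left off here for simplicity).
  .config("spark.kryo.registrationRequired", "false")
  .getOrCreate()
```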


1 Answer

"Shuffle Read Blocked Time" is the time that tasks spent blocked waiting for shuffle data to be read from remote machines. The exact metric it feeds from is shuffleReadMetrics.fetchWaitTime.

It's hard to suggest a mitigation strategy without knowing what data you're reading or what sort of remote machines you're reading it from. However, consider the following:

  1. Check your connection to the remote machines from which you're reading data.
  2. Check your code/jobs to ensure that you're only reading data you absolutely need to finish the job (see the sketch after this list).
  3. In some cases, you could consider splitting your job into multiple jobs that run in parallel, so long as they are independent of each other.
  4. Perhaps you could upgrade your cluster to have more nodes so you can split the workload to be more granular and thus have an overall smaller wait time.
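To make point 2 concrete, here's a rough sketch of trimming what gets shuffled before a wide join. The paths, column names, and partition count are all hypothetical; the idea is simply to shrink the data before the shuffle happens.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("shuffle-tuning").getOrCreate()
import spark.implicits._

// Hypothetical tables; the point is to shrink data *before* the shuffle.
val events = spark.read.parquet("/data/events")   // path is illustrative
val users  = spark.read.parquet("/data/users")

val joined = events
  .select("user_id", "amount")          // column pruning: shuffle fewer bytes
  .filter($"amount" > 0)                // row filtering before the wide join
  .join(users.select("user_id", "country"), "user_id")

// Tuning the shuffle partition count changes how granular the fetches are;
// 400 is an arbitrary example value.
spark.conf.set("spark.sql.shuffle.partitions", "400")
```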

As for the metrics, this documentation should shed some light on them: https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-webui-StagePage.html

Lastly, I also found it hard to find information on Shuffle Read Blocked Time, but if you search Google for the exact phrase in quotes, "Shuffle Read Blocked Time", you'll find some decent results.

answered Sep 17 '22 by user3124181