Some background information:
I'm working with Dataiku DSS, HDFS, and partitioned datasets. I have a particular job running (Hive query) which has two input datasets - one a very large, partitioned dataset, the other a small (~250 rows, 2 columns), non-partitioned dataset. Let's call the partitioned table A, and the non-partitioned table B.
Question:
The query is of the following format,
SELECT a.f1, f2, ..., fn
FROM A as a
LEFT JOIN B as b
ON a.f1 = b.f1
WHERE {PARTITION_FILTER}
Here is the current ouput from the MapReduce job (keep in mind this job is still running):
[09:05:53] [INFO] [dku.utils] - INFO : Total jobs = 4
[09:05:53] [INFO] [dku.utils] - INFO : Starting task [Stage-10:CONDITIONAL] in serial mode
[09:05:53] [INFO] [dku.utils] - INFO : Stage-11 is filtered out by condition resolver.
[09:05:53] [INFO] [dku.utils] - INFO : Stage-1 is selected by condition resolver.
[09:05:53] [INFO] [dku.utils] - INFO : Launching Job 1 out of 4
[09:05:53] [INFO] [dku.utils] - INFO : Starting task [Stage-1:MAPRED] in serial mode
[09:05:53] [INFO] [dku.utils] - INFO : Number of reduce tasks not specified. Estimated from input data size: 307
[09:05:53] [INFO] [dku.utils] - INFO : In order to change the average load for a reducer (in bytes):
[09:05:53] [INFO] [dku.utils] - INFO : set hive.exec.reducers.bytes.per.reducer=<number>
[09:05:53] [INFO] [dku.utils] - INFO : In order to limit the maximum number of reducers:
[09:05:53] [INFO] [dku.utils] - INFO : set hive.exec.reducers.max=<number>
[09:05:53] [INFO] [dku.utils] - INFO : In order to set a constant number of reducers:
[09:05:53] [INFO] [dku.utils] - INFO : set mapreduce.job.reduces=<number>
[09:05:53] [INFO] [dku.utils] - INFO : number of splits:75
[09:05:53] [INFO] [dku.utils] - INFO : Submitting tokens for job: job_1529508387832_10211
[09:05:53] [INFO] [dip.hiveserver2.log.sniffer] - sniffed applicationId => application_1529508387832_10211/
[09:05:53] [INFO] [dku.utils] - INFO : Kill Command = /opt/cloudera/parcels/CDH-5.12.1-1.cdh5.12.1.p0.3/lib/hadoop/bin/hadoop job -kill job_1529508387832_10211
[09:05:53] [INFO] [dku.utils] - INFO : Hadoop job information for Stage-1: number of mappers: 75; number of reducers: 307
[09:05:53] [INFO] [dku.utils] - INFO : 2018-06-27 09:05:47,749 Stage-1 map = 0%, reduce = 0%
[09:06:48] [INFO] [dku.utils] - INFO : 2018-06-27 09:06:48,444 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 278.18 sec
[09:06:53] [INFO] [dku.utils] - INFO : 2018-06-27 09:06:51,682 Stage-1 map = 1%, reduce = 0%, Cumulative CPU 373.35 sec
[09:07:03] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:00,159 Stage-1 map = 2%, reduce = 0%, Cumulative CPU 501.46 sec
[09:07:03] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:02,235 Stage-1 map = 3%, reduce = 0%, Cumulative CPU 539.6 sec
[09:07:08] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:07,472 Stage-1 map = 4%, reduce = 0%, Cumulative CPU 1389.81 sec
[09:07:13] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:10,605 Stage-1 map = 5%, reduce = 0%, Cumulative CPU 2172.52 sec
[09:07:13] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:12,574 Stage-1 map = 6%, reduce = 0%, Cumulative CPU 2577.81 sec
[09:07:13] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:13,604 Stage-1 map = 8%, reduce = 0%, Cumulative CPU 2865.44 sec
[09:07:18] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:15,747 Stage-1 map = 9%, reduce = 0%, Cumulative CPU 3110.21 sec
[09:07:23] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:19,898 Stage-1 map = 10%, reduce = 0%, Cumulative CPU 4080.2 sec
[09:07:23] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:21,988 Stage-1 map = 11%, reduce = 0%, Cumulative CPU 4522.48 sec
[09:07:23] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:23,015 Stage-1 map = 12%, reduce = 0%, Cumulative CPU 4755.96 sec
[09:07:33] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:29,335 Stage-1 map = 13%, reduce = 0%, Cumulative CPU 5710.85 sec
[09:07:33] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:31,407 Stage-1 map = 14%, reduce = 0%, Cumulative CPU 5948.34 sec
[09:07:38] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:34,555 Stage-1 map = 15%, reduce = 0%, Cumulative CPU 6399.6 sec
[09:07:38] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:37,663 Stage-1 map = 16%, reduce = 0%, Cumulative CPU 6811.22 sec
[09:07:43] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:38,695 Stage-1 map = 19%, reduce = 0%, Cumulative CPU 7087.68 sec
[09:07:43] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:39,729 Stage-1 map = 20%, reduce = 0%, Cumulative CPU 7288.22 sec
[09:07:43] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:40,769 Stage-1 map = 22%, reduce = 0%, Cumulative CPU 7520.54 sec
[09:07:43] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:41,903 Stage-1 map = 24%, reduce = 0%, Cumulative CPU 7771.37 sec
[09:07:43] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:42,930 Stage-1 map = 25%, reduce = 0%, Cumulative CPU 7936.9 sec
[09:07:48] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:45,035 Stage-1 map = 27%, reduce = 0%, Cumulative CPU 8254.78 sec
[09:07:48] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:46,075 Stage-1 map = 29%, reduce = 0%, Cumulative CPU 8428.35 sec
[09:07:48] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:47,111 Stage-1 map = 30%, reduce = 0%, Cumulative CPU 8661.23 sec
[09:07:48] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:48,153 Stage-1 map = 31%, reduce = 0%, Cumulative CPU 8834.37 sec
[09:07:53] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:49,193 Stage-1 map = 33%, reduce = 0%, Cumulative CPU 8983.68 sec
[09:07:53] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:50,227 Stage-1 map = 35%, reduce = 0%, Cumulative CPU 9149.94 sec
[09:07:53] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:51,263 Stage-1 map = 37%, reduce = 0%, Cumulative CPU 9268.9 sec
[09:07:53] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:52,301 Stage-1 map = 38%, reduce = 0%, Cumulative CPU 9415.86 sec
[09:07:53] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:53,352 Stage-1 map = 39%, reduce = 0%, Cumulative CPU 9540.63 sec
[09:07:58] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:54,381 Stage-1 map = 41%, reduce = 0%, Cumulative CPU 9711.54 sec
[09:07:58] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:55,421 Stage-1 map = 43%, reduce = 0%, Cumulative CPU 9823.52 sec
[09:07:58] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:56,453 Stage-1 map = 46%, reduce = 0%, Cumulative CPU 10010.83 sec
[09:07:58] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:57,492 Stage-1 map = 49%, reduce = 0%, Cumulative CPU 10081.9 sec
[09:07:58] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:58,532 Stage-1 map = 53%, reduce = 0%, Cumulative CPU 10230.13 sec
[09:08:03] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:59,576 Stage-1 map = 56%, reduce = 0%, Cumulative CPU 10392.61 sec
[09:08:03] [INFO] [dku.utils] - INFO : 2018-06-27 09:08:00,607 Stage-1 map = 58%, reduce = 0%, Cumulative CPU 10483.38 sec
[09:08:03] [INFO] [dku.utils] - INFO : 2018-06-27 09:08:01,649 Stage-1 map = 63%, reduce = 0%, Cumulative CPU 10618.16 sec
[09:08:03] [INFO] [dku.utils] - INFO : 2018-06-27 09:08:02,672 Stage-1 map = 66%, reduce = 0%, Cumulative CPU 10684.82 sec
[09:08:08] [INFO] [dku.utils] - INFO : 2018-06-27 09:08:03,695 Stage-1 map = 67%, reduce = 0%, Cumulative CPU 10701.95 sec
[09:08:08] [INFO] [dku.utils] - INFO : 2018-06-27 09:08:04,720 Stage-1 map = 70%, reduce = 0%, Cumulative CPU 10767.21 sec
[09:08:08] [INFO] [dku.utils] - INFO : 2018-06-27 09:08:05,750 Stage-1 map = 71%, reduce = 0%, Cumulative CPU 10849.92 sec
[09:08:08] [INFO] [dku.utils] - INFO : 2018-06-27 09:08:06,780 Stage-1 map = 73%, reduce = 0%, Cumulative CPU 10924.45 sec
[09:08:08] [INFO] [dku.utils] - INFO : 2018-06-27 09:08:07,902 Stage-1 map = 76%, reduce = 0%, Cumulative CPU 11000.21 sec
[09:08:13] [INFO] [dku.utils] - INFO : 2018-06-27 09:08:09,965 Stage-1 map = 77%, reduce = 0%, Cumulative CPU 11013.58 sec
[09:08:13] [INFO] [dku.utils] - INFO : 2018-06-27 09:08:10,991 Stage-1 map = 78%, reduce = 0%, Cumulative CPU 11057.76 sec
[09:08:18] [INFO] [dku.utils] - INFO : 2018-06-27 09:08:14,216 Stage-1 map = 80%, reduce = 0%, Cumulative CPU 11157.51 sec
[09:08:18] [INFO] [dku.utils] - INFO : 2018-06-27 09:08:17,362 Stage-1 map = 82%, reduce = 0%, Cumulative CPU 11392.85 sec
[09:08:23] [INFO] [dku.utils] - INFO : 2018-06-27 09:08:20,460 Stage-1 map = 83%, reduce = 0%, Cumulative CPU 11610.7 sec
[09:08:28] [INFO] [dku.utils] - INFO : 2018-06-27 09:08:26,106 Stage-1 map = 84%, reduce = 0%, Cumulative CPU 11781.65 sec
[09:08:28] [INFO] [dku.utils] - INFO : 2018-06-27 09:08:28,163 Stage-1 map = 85%, reduce = 0%, Cumulative CPU 11788.58 sec
[09:08:38] [INFO] [dku.utils] - INFO : 2018-06-27 09:08:35,410 Stage-1 map = 86%, reduce = 0%, Cumulative CPU 12167.24 sec
[09:08:38] [INFO] [dku.utils] - INFO : 2018-06-27 09:08:38,540 Stage-1 map = 86%, reduce = 1%, Cumulative CPU 12317.09 sec
[09:08:43] [INFO] [dku.utils] - INFO : 2018-06-27 09:08:39,583 Stage-1 map = 86%, reduce = 2%, Cumulative CPU 12329.8 sec
[09:08:43] [INFO] [dku.utils] - INFO : 2018-06-27 09:08:40,631 Stage-1 map = 86%, reduce = 3%, Cumulative CPU 12333.61 sec
[09:08:48] [INFO] [dku.utils] - INFO : 2018-06-27 09:08:47,927 Stage-1 map = 87%, reduce = 3%, Cumulative CPU 12651.77 sec
[09:08:53] [INFO] [dku.utils] - INFO : 2018-06-27 09:08:48,970 Stage-1 map = 88%, reduce = 4%, Cumulative CPU 12826.37 sec
[09:08:53] [INFO] [dku.utils] - INFO : 2018-06-27 09:08:51,092 Stage-1 map = 88%, reduce = 5%, Cumulative CPU 12857.19 sec
[09:08:58] [INFO] [dku.utils] - INFO : 2018-06-27 09:08:54,214 Stage-1 map = 90%, reduce = 5%, Cumulative CPU 13037.63 sec
[09:09:03] [INFO] [dku.utils] - INFO : 2018-06-27 09:08:59,396 Stage-1 map = 91%, reduce = 5%, Cumulative CPU 13117.71 sec
[09:09:03] [INFO] [dku.utils] - INFO : 2018-06-27 09:09:00,440 Stage-1 map = 92%, reduce = 5%, Cumulative CPU 13238.06 sec
[09:09:03] [INFO] [dku.utils] - INFO : 2018-06-27 09:09:01,485 Stage-1 map = 93%, reduce = 5%, Cumulative CPU 13249.8 sec
[09:09:08] [INFO] [dku.utils] - INFO : 2018-06-27 09:09:05,660 Stage-1 map = 94%, reduce = 5%, Cumulative CPU 13306.0 sec
[09:09:08] [INFO] [dku.utils] - INFO : 2018-06-27 09:09:06,706 Stage-1 map = 97%, reduce = 5%, Cumulative CPU 13393.5 sec
[09:09:08] [INFO] [dku.utils] - INFO : 2018-06-27 09:09:07,751 Stage-1 map = 97%, reduce = 6%, Cumulative CPU 13409.12 sec
[09:09:08] [INFO] [dku.utils] - INFO : 2018-06-27 09:09:08,795 Stage-1 map = 98%, reduce = 6%, Cumulative CPU 13433.07 sec
[09:09:13] [INFO] [dku.utils] - INFO : 2018-06-27 09:09:09,835 Stage-1 map = 98%, reduce = 8%, Cumulative CPU 13474.03 sec
[09:09:13] [INFO] [dku.utils] - INFO : 2018-06-27 09:09:10,874 Stage-1 map = 98%, reduce = 9%, Cumulative CPU 13484.64 sec
[09:09:18] [INFO] [dku.utils] - INFO : 2018-06-27 09:09:14,004 Stage-1 map = 100%, reduce = 11%, Cumulative CPU 13580.71 sec
[09:09:18] [INFO] [dku.utils] - INFO : 2018-06-27 09:09:15,118 Stage-1 map = 100%, reduce = 13%, Cumulative CPU 13619.15 sec
[09:09:18] [INFO] [dku.utils] - INFO : 2018-06-27 09:09:16,160 Stage-1 map = 100%, reduce = 16%, Cumulative CPU 13707.2 sec
[09:09:18] [INFO] [dku.utils] - INFO : 2018-06-27 09:09:17,210 Stage-1 map = 100%, reduce = 33%, Cumulative CPU 14456.75 sec
[09:09:18] [INFO] [dku.utils] - INFO : 2018-06-27 09:09:18,258 Stage-1 map = 100%, reduce = 39%, Cumulative CPU 14708.07 sec
[09:09:23] [INFO] [dku.utils] - INFO : 2018-06-27 09:09:19,291 Stage-1 map = 100%, reduce = 40%, Cumulative CPU 14768.29 sec
[09:09:23] [INFO] [dku.utils] - INFO : 2018-06-27 09:09:20,329 Stage-1 map = 100%, reduce = 41%, Cumulative CPU 14834.88 sec
[09:09:23] [INFO] [dku.utils] - INFO : 2018-06-27 09:09:21,360 Stage-1 map = 100%, reduce = 42%, Cumulative CPU 14902.4 sec
[09:09:23] [INFO] [dku.utils] - INFO : 2018-06-27 09:09:22,399 Stage-1 map = 100%, reduce = 45%, Cumulative CPU 15040.16 sec
[09:09:23] [INFO] [dku.utils] - INFO : 2018-06-27 09:09:23,433 Stage-1 map = 100%, reduce = 47%, Cumulative CPU 15165.58 sec
[09:09:28] [INFO] [dku.utils] - INFO : 2018-06-27 09:09:28,627 Stage-1 map = 100%, reduce = 63%, Cumulative CPU 15792.29 sec
[09:09:33] [INFO] [dku.utils] - INFO : 2018-06-27 09:09:30,711 Stage-1 map = 100%, reduce = 64%, Cumulative CPU 15889.21 sec
[09:09:33] [INFO] [dku.utils] - INFO : 2018-06-27 09:09:31,753 Stage-1 map = 100%, reduce = 65%, Cumulative CPU 15898.95 sec
[09:09:33] [INFO] [dku.utils] - INFO : 2018-06-27 09:09:32,789 Stage-1 map = 100%, reduce = 66%, Cumulative CPU 15927.26 sec
[09:09:33] [INFO] [dku.utils] - INFO : 2018-06-27 09:09:33,822 Stage-1 map = 100%, reduce = 71%, Cumulative CPU 16086.93 sec
[09:09:38] [INFO] [dku.utils] - INFO : 2018-06-27 09:09:34,866 Stage-1 map = 100%, reduce = 90%, Cumulative CPU 16711.13 sec
[09:09:38] [INFO] [dku.utils] - INFO : 2018-06-27 09:09:35,907 Stage-1 map = 100%, reduce = 93%, Cumulative CPU 16795.34 sec
[09:09:38] [INFO] [dku.utils] - INFO : 2018-06-27 09:09:36,952 Stage-1 map = 100%, reduce = 95%, Cumulative CPU 16881.47 sec
[09:09:38] [INFO] [dku.utils] - INFO : 2018-06-27 09:09:37,995 Stage-1 map = 100%, reduce = 96%, Cumulative CPU 16891.18 sec
[09:09:48] [INFO] [dku.utils] - INFO : 2018-06-27 09:09:44,249 Stage-1 map = 100%, reduce = 97%, Cumulative CPU 16958.21 sec
[09:09:48] [INFO] [dku.utils] - INFO : 2018-06-27 09:09:45,292 Stage-1 map = 100%, reduce = 98%, Cumulative CPU 17011.88 sec
[09:09:48] [INFO] [dku.utils] - INFO : 2018-06-27 09:09:47,378 Stage-1 map = 100%, reduce = 99%, Cumulative CPU 17055.07 sec
[09:10:49] [INFO] [dku.utils] - INFO : 2018-06-27 09:10:47,421 Stage-1 map = 100%, reduce = 99%, Cumulative CPU 17545.89 sec
[09:11:49] [INFO] [dku.utils] - INFO : 2018-06-27 09:11:47,872 Stage-1 map = 100%, reduce = 99%, Cumulative CPU 17764.45 sec
[09:12:49] [INFO] [dku.utils] - INFO : 2018-06-27 09:12:48,287 Stage-1 map = 100%, reduce = 99%, Cumulative CPU 18330.86 sec
[09:13:49] [INFO] [dku.utils] - INFO : 2018-06-27 09:13:48,855 Stage-1 map = 100%, reduce = 99%, Cumulative CPU 19232.58 sec
...
...
...
...
...
[10:22:20] [INFO] [dku.utils] - INFO : 2018-06-27 10:22:17,545 Stage-1 map = 100%, reduce = 99%, Cumulative CPU 26857.28 sec
[10:23:23] [INFO] [dku.utils] - INFO : 2018-06-27 10:23:17,654 Stage-1 map = 100%, reduce = 99%, Cumulative CPU 26974.96 sec
[10:24:18] [INFO] [dku.utils] - INFO : 2018-06-27 10:24:18,112 Stage-1 map = 100%, reduce = 99%, Cumulative CPU 27081.0 sec
[10:25:23] [INFO] [dku.utils] - INFO : 2018-06-27 10:25:18,964 Stage-1 map = 100%, reduce = 99%, Cumulative CPU 27187.52 sec
[10:26:23] [INFO] [dku.utils] - INFO : 2018-06-27 10:26:19,404 Stage-1 map = 100%, reduce = 99%, Cumulative CPU 27287.58 sec
As you can see, the final 1% of the reduce phase is taking quite a while (so far 1h27min of the total 1h32min elapsed).
I'm having trouble finding resources which give a clear and concise explanation to why this is happening. It's also worth mentioning I have a very basic understanding of MapReduce, but am currently in the process of studying MapReduce in an HDFS context. From what I've read, it is possible that the bulk of the reduce task was pinned on one reducer due to the nature of the JOIN
being performed.
If anyone can give a high-level explanation of what is happening, point me in the direction of some good resources, and give tips on how I can avoid this in future jobs by altering my Hive queries, that would be greatly appreciated.
In the event of node failure, before the map output is consumed by the reduce task, Hadoop reruns the map task on another node and re-creates the map output. Reduce task doesn’t work on the concept of data locality.
Reduce phase is the final step in MapReduce Algorithm. Reduce is inherently sequential unless processing multiple tasks. It takes list of <Key, List<Value>> sorted pairs from Shuffle Function and perform reduce operation. Reduce function receives an iterator values from an output list for the specific key.
Note: Map and Reduce are two different processes of the second component of Hadoop, that is, Map Reduce. These are also called phases of Map Reduce. Thus we can say that Map Reduce has two phases. Phase 1 is Map and Phase 2 is Reduce.
MapReduce algorithm is based on sending the processing node (local system) to the place where the data exists. MapReduce Algorithm works by breaking the process into 3 phases. In MapReduce, each phase has key-value pairs as input and output. MapReduce will always expect the input in the form of Key & Value pairs from HDFS layers.
If final reducer is a join then it looks like skew in join key. First of all check two things:
check that b.f1 join key has no duplicates:
select b.f1, count(*) cnt from B b
group by b.f1
having count(*)>1 order by cnt desc;
check the distribution of a.f1:
select a.f1, count(*) cnt from A a
group by a.f1
order by cnt desc
limit 10;
This query will show skewed keys.
If there is a skew (too many rows with the same value) then join skewed keys separately, use union all:
SELECT a.f1, f2, ..., fn
FROM ( select * from A where f1 = skewed_value) as a --skewed
LEFT JOIN B as b
ON a.f1 = b.f1
WHERE {PARTITION_FILTER}
UNION ALL
SELECT a.f1, f2, ..., fn
FROM ( select * from A where f1 != skewed_value) as a --all other
LEFT JOIN B as b
ON a.f1 = b.f1
WHERE {PARTITION_FILTER}
And finally if there is no issues with skew and duplication, then try to increase reducers parallelism: Get current bytes per reducer configuration
set hive.exec.reducers.bytes.per.reducer;
typically this will return some value about 1G. Try to divide by two, set new value before your query and check how many reducers will it start and performance. Success criteria is more reducers has started and performance improved.
set hive.exec.reducers.bytes.per.reducer=67108864;
The less the bytes per reducer the more reducers will be started, increasing parallelism;
UPDATE: Try to enable map-join, your second table is small enough to fit in memory, mapjoin will work without reducers at all and it will be no problem with skew on reducers.
How to enable mapjoin: https://stackoverflow.com/a/49154414/2700344
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With