I have a Hive query that selects about 30 columns and around 400,000 records and inserts them into another table. I have one join in my query, which is just an inner join.
The query fails with a Java "GC overhead limit exceeded" error.
What's strange is that if I remove the join clause and just select the data from the table (slightly higher volume) then the query works fine.
I'm pretty new to Hive. I can't understand why this join is causing memory exceptions.
Is there something I should be aware of in how I write Hive queries so that they don't cause these issues? Could anyone explain why the join might cause this problem when selecting a higher volume of data with the same number of columns does not?
Appreciate your thoughts on this. Thanks
Depending on the version of Hive and your configuration, the answer to your question may vary. It would be easier if you could share your exact query along with the create statements of the two tables and an estimate of their sizes.
To get a better understanding of the problem, let's go through how a "regular" inner join works in Hive.
Hive join in MapReduce:
Here is a simplified description of how an inner join in Hive gets compiled to MapReduce. In general, if you have two tables t1 and t2 with a join query like:
SELECT t1.key, t1.value, t2.value
FROM t1
JOIN t2 ON (t1.key = t2.key);
where t1 has the following contents:
k_1 v1_1
k_2 v1_2
k_3 v1_3
and t2 has the following contents:
k_2 v2_2
k_3 v2_3
k_4 v2_4
We would expect the join result to be
k_2 v1_2 v2_2
k_3 v1_3 v2_3
Assuming the tables are stored on HDFS, their contents will be split up into file splits. A mapper takes a file split as input and emits records whose key is the table's key column and whose value is a composite of the table's value column and a flag indicating which table the record came from (t1 or t2).
For t1:
k_1, <v1_1, t1>
k_2, <v1_2, t1>
k_3, <v1_3, t1>
For t2:
k_2, <v2_2, t2>
k_3, <v2_3, t2>
k_4, <v2_4, t2>
Now, these emitted records go through the shuffle phase, where all the records with the same key are grouped together and sent to a reducer. The context of each reduce operation is one key and a list containing all the values corresponding to that key. In practice, one reducer will perform several reduce operations.
In the above example, we would get the following groupings:
k_1, <<v1_1, t1>>
k_2, <<v1_2, t1>, <v2_2, t2>>
k_3, <<v1_3, t1>, <v2_3, t2>>
k_4, <<v2_4, t2>>
Here is what happens in the reducer. For each key, the reducer takes the cross product of the values coming from the two tables, emitting one joined row per combination:
For k_1, there is no value from t2, so nothing is emitted.
For k_2, one joined row is emitted - k_2, v1_2, v2_2 (one value from each table, so 1 x 1 = 1 row).
For k_3, one joined row is emitted - k_3, v1_3, v2_3 (again 1 x 1 = 1 row).
For k_4, there is no value from t1, so nothing is emitted.
Hence you obtain exactly the result you expected from your inner join.
Ok, so what do I do?
1. It's possible that there is skew in your data. In other words, for some join key the list of values reaching the reducer is very long; for example, a key that appears 10,000 times in each table produces 100,000,000 joined rows in a single reduce call, which is what blows the heap. To alleviate the problem, you may try bumping up the memory available to your JVM: set mapred.child.java.opts to a value like -Xmx512M in your hive-site.xml. You can query the present value of this parameter by running set mapred.child.java.opts; in your Hive shell. A quick skew check and a session-level way to raise the heap are sketched below.
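As a rough sketch (the table t1 and column key are just the placeholder names from the example above; the heap size is illustrative, and on Hadoop 2 / YARN clusters the per-task heap is controlled by mapreduce.map.java.opts and mapreduce.reduce.java.opts rather than mapred.child.java.opts):
-- Check whether a handful of keys dominate the join column (skew check)
SELECT key, COUNT(*) AS cnt
FROM t1
GROUP BY key
ORDER BY cnt DESC
LIMIT 10;
-- Inspect the current child JVM options, then raise the heap for this session only
SET mapred.child.java.opts;
SET mapred.child.java.opts=-Xmx1024m;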
2. You can try using an alternative to the "regular" join, e.g. a map join. The explanation above applies to regular joins, where the joining happens in the reducers. Depending on the version of Hive you are using, Hive may be able to automatically convert a regular join into a map join, which is faster because the join happens in the map phase. To enable the optimization, set hive.auto.convert.join to true. This property was introduced in Hive 0.7. A minimal example is sketched below.
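For instance, as a sketch rather than a drop-in fix (hive.mapjoin.smalltable.filesize is the size threshold, in bytes, under which a table is considered small enough to map-join, and the value shown is only illustrative; the explicit MAPJOIN hint is mainly honoured on older Hive versions):
-- Let Hive convert the join to a map join when one side is small enough
SET hive.auto.convert.join=true;
-- Size threshold (bytes) below which a table counts as "small"
SET hive.mapjoin.smalltable.filesize=50000000;
-- On older versions you can also request a map join explicitly with a hint,
-- naming the smaller table (t2 here, following the example above)
SELECT /*+ MAPJOIN(t2) */ t1.key, t1.value, t2.value
FROM t1
JOIN t2 ON (t1.key = t2.key);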
3. In addition to setting hive.auto.convert.join to true, you may also set hive.optimize.skewjoin to true. This will work around the data-skew problem described in 1; example settings follow below.
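For example (hive.skewjoin.key is the number of rows per join key beyond which Hive treats that key as skewed; 100000 is, as far as I recall, the default and is shown only for illustration):
-- Handle heavily skewed join keys with a follow-up map-join pass
SET hive.optimize.skewjoin=true;
-- Rows per join key before that key is treated as skewed
SET hive.skewjoin.key=100000;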