
Hive map join: out of memory exception

Tags:

hive

mapreduce

I am trying to perform a map-side join between one big table (10 GB) and one small table (230 MB). From the small table I will use all the columns to produce the output records, after joining on the key columns.

I have used the settings below:

set hive.auto.convert.join=true;

set hive.mapjoin.smalltable.filesize=262144000;
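
As a quick sanity check on the threshold above, here is a minimal Python sketch of the arithmetic. It assumes that "230 MB" means 230 MiB of files on disk and that the size check is a strict less-than comparison; neither detail is confirmed in the post, and `below_threshold` is a hypothetical helper, not a Hive function.

```python
# Sketch: compare the small table's size with the configured threshold.
MIB = 1024 * 1024

# Value from the post: set hive.mapjoin.smalltable.filesize=262144000;
SMALLTABLE_FILESIZE = 262144000

# Assumption: the "230 MB" in the post means 230 MiB of files on disk.
small_table_bytes = 230 * MIB


def below_threshold(table_bytes: int, threshold_bytes: int) -> bool:
    """Hypothetical helper: True if the table is under the size threshold."""
    # Assumption: strict comparison; the post does not show how Hive compares.
    return table_bytes < threshold_bytes


print(SMALLTABLE_FILESIZE / MIB)  # threshold expressed in MiB
print(below_threshold(small_table_bytes, SMALLTABLE_FILESIZE))
```

Under these assumptions the threshold works out to 250 MiB, so the 230 MB table falls under it. Note that this compares file size only; it says nothing about how much heap the in-memory hash table will need.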

Logs:

2013-09-20 02:43:50     Starting to launch local task to process map join;      maximum       memory = 1065484288
2013-09-20 02:44:05     Processing rows:        200000  Hashtable size: 199999  Memory usage:   430269904       rate:0.404
2013-09-20 02:44:14     Processing rows:        300000  Hashtable size: 299999  Memory usage:   643070664       rate:0.604
Exception in thread "Thread-0" java.lang.OutOfMemoryError: Java heap space
        at java.util.jar.Manifest$FastInputStream.<init>(Manifest.java:313)
        at java.util.jar.Manifest$FastInputStream.<init>(Manifest.java:308)
        at java.util.jar.Manifest.read(Manifest.java:176)
        at java.util.jar.Manifest.<init>(Manifest.java:50)
        at java.util.jar.JarFile.getManifestFromReference(JarFile.java:168)
        at java.util.jar.JarFile.getManifest(JarFile.java:149)
        at sun.misc.URLClassPath$JarLoader$2.getManifest(URLClassPath.java:696)
        at java.net.URLClassLoader.defineClass(URLClassLoader.java:228)
        at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
        at org.apache.hadoop.util.RunJar$1.run(RunJar.java:126)
Execution failed with exit status: 3
Obtaining error information
Task failed!
Task ID:
  Stage-7
Logs:
FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.MapredLocalTask
ATTEMPT: Execute BackupTask: org.apache.hadoop.hive.ql.exec.MapRedTask

However, I am still facing the OOM exception. The heap size set in my cluster is 1 GB. Which properties do I need to consider and tune to make this map-side join work?
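
To make the numbers in the log easier to read, the following Python sketch recomputes them. The logged `rate` values match memory usage divided by the maximum memory, and the difference between the two progress lines gives a rough heap cost per row. The function names are hypothetical, and the per-row figure is only a linear estimate from two samples.

```python
# Sketch: recompute the figures from the two "Processing rows" log lines.
MAX_MEMORY = 1065484288  # "maximum memory" from the first log line

# (rows processed, memory usage in bytes), copied from the log
samples = [(200000, 430269904), (300000, 643070664)]


def usage_rate(memory_usage: int, max_memory: int) -> float:
    """Fraction of the local task's heap in use, rounded like the log."""
    return round(memory_usage / max_memory, 3)


def bytes_per_row(first: tuple, second: tuple) -> float:
    """Rough marginal heap cost per row between two log samples."""
    (rows_a, mem_a), (rows_b, mem_b) = first, second
    return (mem_b - mem_a) / (rows_b - rows_a)


for rows, mem in samples:
    print(rows, usage_rate(mem, MAX_MEMORY))  # matches rate:0.404 and rate:0.604

print(bytes_per_row(*samples))  # a little over 2 KB of heap per row
```

By this estimate the hash table already occupies about 60% of the roughly 1 GB heap after 300,000 rows, which suggests the in-memory footprint of the small table is considerably larger than its 230 MB on disk.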

hjamali52 asked Sep 20 '13 09:09



2 Answers

I faced this problem and was only able to get past it by using set hive.auto.convert.join=false; which stops Hive from automatically converting the join into a map join.

Run2 answered Sep 21 '22 20:09



Use set hive.auto.convert.join=false; and you will not get the memory exception.

dilshad answered Sep 21 '22 20:09
