Overriding default hadoop jars in class path

Question

I've seen many ways of making the user class path take precedence over the Hadoop one. This is often done when an M/R job needs a specific version of a library that Hadoop itself already ships in an older version (for example, Jackson's JSON parser or Commons HTTP).

In any case, I've seen:

mapreduce.task.classpath.user.precedence
mapreduce.task.classpath.first
mapreduce.job.user.classpath.first

Which of these parameters is the right one to set in my job configuration to force mappers and reducers to use a class path that puts my user-defined hadoop_classpath jars BEFORE the Hadoop default dependency jars?

By the way, this is related to this question: "Dynamodb requestHandler acception", which I recently found is due to a jar conflict.

Chris White · Accepted Answer

So, assuming you're using 0.20.203, this is handled in the TaskRunner.java code as follows:

The property you're looking for is on line 94 - mapreduce.user.classpath.first
Line 214 is where the call is made to build the list of classpaths, which delegates to a method called getClassPaths(..)
getClassPaths() is defined on line 524, and you should be able to see that the configuration property decides whether your job and distributed cache libraries or the Hadoop libraries go on the classpath first
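
To make the ordering decision concrete, here is a minimal sketch of the idea. It is not the actual TaskRunner code; the class name ClasspathOrderSketch, the method buildClasspath and the jar paths are hypothetical and only illustrate how a boolean flag can decide which group of entries comes first:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; this is NOT the Hadoop TaskRunner source.
class ClasspathOrderSketch {

    // Returns the combined classpath. User entries go first only when
    // userFirst is true; otherwise the Hadoop entries go first.
    static List<String> buildClasspath(List<String> userJars,
                                       List<String> hadoopJars,
                                       boolean userFirst) {
        List<String> classpath = new ArrayList<>();
        if (userFirst) {
            classpath.addAll(userJars);
            classpath.addAll(hadoopJars);
        } else {
            classpath.addAll(hadoopJars);
            classpath.addAll(userJars);
        }
        return classpath;
    }

    public static void main(String[] args) {
        // Made-up jar names standing in for a newer user library and an older bundled one.
        List<String> user = Arrays.asList("job/lib/jackson-new.jar");
        List<String> hadoop = Arrays.asList("hadoop/lib/jackson-old.jar");
        System.out.println(String.join(":", buildClasspath(user, hadoop, true)));
    }
}
```

Because the JVM loads a class from the first classpath entry that contains it, whichever group is listed first wins when both contain the same class.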

For other versions of Hadoop, it's best to check the TaskRunner.java class to confirm the name of the config property; after all, this is a "semi-hidden config":

static final String MAPREDUCE_USER_CLASSPATH_FIRST =
        "mapreduce.user.classpath.first"; //a semi-hidden config

fengyun · Answer

As of recent Hadoop versions (2.2+), you should set:

    conf.setBoolean(MRJobConfig.MAPREDUCE_JOB_USER_CLASSPATH_FIRST, true);
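
Since the property name differs between versions, one option is to look it up by version before setting it. The following is only a sketch: the class UserClasspathFirstKey and its method forVersion are hypothetical helpers, not part of Hadoop. It covers just the two cases named in the answers here (0.20.203 and 2.2+), and it assumes that MRJobConfig.MAPREDUCE_JOB_USER_CLASSPATH_FIRST resolves to the string "mapreduce.job.user.classpath.first":

```java
import java.util.Optional;

// Hypothetical helper, not part of Hadoop: maps a Hadoop version string to the
// "user classpath first" property name given in the answers.
class UserClasspathFirstKey {

    static Optional<String> forVersion(String version) {
        // 0.20.203: the semi-hidden property read by TaskRunner.java.
        if (version.equals("0.20.203") || version.startsWith("0.20.203.")) {
            return Optional.of("mapreduce.user.classpath.first");
        }
        String[] parts = version.split("\\.");
        try {
            int major = Integer.parseInt(parts[0]);
            int minor = parts.length > 1 ? Integer.parseInt(parts[1]) : 0;
            // 2.2 and later: assumed value of MRJobConfig.MAPREDUCE_JOB_USER_CLASSPATH_FIRST.
            if (major > 2 || (major == 2 && minor >= 2)) {
                return Optional.of("mapreduce.job.user.classpath.first");
            }
        } catch (NumberFormatException e) {
            // Version string with non-numeric major/minor parts: treated as unknown.
        }
        // Any other version: check that version's source to find the right name.
        return Optional.empty();
    }
}
```

When a key is returned, it can be passed to conf.setBoolean(key, true) on the job's Configuration. An empty result means the name should be confirmed in the source for that version rather than guessed.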

Tags: jar, classpath, operator-precedence, hadoop

Asked by: jayunit100
