This question isn't about distributing jars across the whole cluster for the workers to use.
It is about specifying additional libraries on the client machine. To be more specific: I'm trying to run the following command in order to retrieve the contents of a SequenceFile:
/path/to/hadoop/script fs -text /path/in/HDFS/to/my/file
It throws this error:

text: java.io.IOException: WritableName can't load class: util.io.DoubleArrayWritable
I have a writable class called DoubleArrayWritable; in fact, on another computer everything works well.
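For reference, a class like that usually follows the standard ArrayWritable pattern. The sketch below only illustrates what a class matching the error message might look like; it is an assumption, not the actual code behind the error:

// Illustrative sketch only; assumes the usual ArrayWritable-based pattern.
package util.io;

import org.apache.hadoop.io.ArrayWritable;
import org.apache.hadoop.io.DoubleWritable;

public class DoubleArrayWritable extends ArrayWritable {
    // No-arg constructor is required so the framework can instantiate it by reflection.
    public DoubleArrayWritable() {
        super(DoubleWritable.class);
    }

    public DoubleArrayWritable(DoubleWritable[] values) {
        super(DoubleWritable.class, values);
    }
}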
I tried to set HADOOP_CLASSPATH to include the jar containing that class, but with no results. Actually, when running:
/path/to/hadoop/script classpath
The result doesn't contain the jar which I added to HADOOP_CLASSPATH.
The question is: how do you specify extra libraries when running hadoop ("extra" meaning libraries other than the ones the hadoop script automatically includes in the classpath)?
Some more info that might help: my hadoop-env.sh contains the line
export HADOOP_CLASSPATH=$HADOOP_HOME/lib
which probably explains why my HADOOP_CLASSPATH env var is ignored.

If you are allowed to set HADOOP_CLASSPATH, then
export HADOOP_CLASSPATH=/path/to/jar/myjar.jar:$HADOOP_CLASSPATH; \
hadoop fs -text /path/in/HDFS/to/my/file
will do the job. Since in your case that variable is overridden in hadoop-env.sh, consider using the -libjars option instead:
hadoop fs -libjars /path/to/jar/myjar.jar -text /path/in/HDFS/to/my/file
Alternatively, invoke FsShell manually:
java -cp $HADOOP_HOME/lib/*:/path/to/jar/myjar.jar:$CLASSPATH \
org.apache.hadoop.fs.FsShell -conf $HADOOP_HOME/conf/core-site.xml \
-text /path/in/HDFS/to/my/file
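If you need to do this often, the same idea can be wrapped in a small driver class that you compile and run with the extra jar on its classpath. This is just a sketch; the class name and the hard-coded path are assumptions for illustration:

// Hypothetical driver that calls FsShell programmatically; compile and run it
// with both the Hadoop jars and the jar containing util.io.DoubleArrayWritable
// on the classpath.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FsShell;
import org.apache.hadoop.util.ToolRunner;

public class TextDump {
    public static void main(String[] args) throws Exception {
        // Picks up core-site.xml (and hdfs-site.xml) from the classpath, if present.
        Configuration conf = new Configuration();
        int exitCode = ToolRunner.run(conf, new FsShell(),
                new String[] { "-text", "/path/in/HDFS/to/my/file" });
        System.exit(exitCode);
    }
}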
If someone wants to check the hadoop classpath, enter hadoop classpath in a terminal.
To compile a Java file against that classpath, use:

javac -cp $(hadoop classpath):path/to/jars/* java_file.java
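You can also read the SequenceFile directly from Java instead of going through hadoop fs -text; this makes the classpath requirement explicit, because the reader instantiates the key/value classes named in the file header, which is exactly the lookup that fails with the WritableName error. A rough sketch, reusing the path from the question (class and file names here are illustrative):

// Rough sketch: dump a SequenceFile by hand. The jar with util.io.DoubleArrayWritable
// must be on the classpath, otherwise instantiating the value class fails just as it
// does for hadoop fs -text.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.util.ReflectionUtils;

public class DumpSequenceFile {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path path = new Path("/path/in/HDFS/to/my/file");
        SequenceFile.Reader reader = new SequenceFile.Reader(fs, path, conf);
        try {
            // Key and value classes are read from the file header and instantiated by name.
            Writable key = (Writable) ReflectionUtils.newInstance(reader.getKeyClass(), conf);
            Writable value = (Writable) ReflectionUtils.newInstance(reader.getValueClass(), conf);
            while (reader.next(key, value)) {
                System.out.println(key + "\t" + value);
            }
        } finally {
            reader.close();
        }
    }
}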