A stock Hadoop 2.6.0 install gives me "No FileSystem for scheme: s3n". Adding hadoop-aws.jar to the classpath now gives me ClassNotFoundException: org.apache.hadoop.fs.s3a.S3AFileSystem.
I've got a mostly stock install of hadoop-2.6.0. I've only set directories and the following environment variables:
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64/jre
export HADOOP_COMMON_HOME=/opt/hadoop
export HADOOP_HOME=$HADOOP_COMMON_HOME
export HADOOP_HDFS_HOME=$HADOOP_COMMON_HOME
export HADOOP_MAPRED_HOME=$HADOOP_COMMON_HOME
export HADOOP_OPTS=-XX:-PrintWarnings
export PATH=$PATH:$HADOOP_COMMON_HOME/bin
The output of hadoop classpath is:
/opt/hadoop/etc/hadoop:/opt/hadoop/share/hadoop/common/lib/*:/opt/hadoop/share/hadoop/common/*:/opt/hadoop/share/hadoop/hdfs:/opt/hadoop/share/hadoop/hdfs/lib/*:/opt/hadoop/share/hadoop/hdfs/*:/opt/hadoop/share/hadoop/yarn/lib/*:/opt/hadoop/share/hadoop/yarn/*:/opt/hadoop/share/hadoop/mapreduce/lib/*:/opt/hadoop/share/hadoop/mapreduce/*:/contrib/capacity-scheduler/*.jar:/opt/hadoop/share/hadoop/tools/lib/*
When I try to run hadoop distcp -update hdfs:///files/to/backup s3n://${S3KEY}:${S3SECRET}@bucket/files/to/backup, I get Error: java.io.IOException: No FileSystem for scheme: s3n. If I use s3a instead, I get the same error complaining about s3a.
The internet told me that hadoop-aws.jar is not part of the classpath by default. I added the following line to /opt/hadoop/etc/hadoop/hadoop-env.sh:
HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$HADOOP_COMMON_HOME/share/hadoop/tools/lib/*
and now hadoop classpath has the following appended to it:
:/opt/hadoop/share/hadoop/tools/lib/*
which should cover /opt/hadoop/share/hadoop/tools/lib/hadoop-aws-2.6.0.jar. Now I get:
Caused by: java.lang.ClassNotFoundException:
Class org.apache.hadoop.fs.s3a.S3AFileSystem not found
The jar file contains the class that can't be found:
unzip -l /opt/hadoop/share/hadoop/tools/lib/hadoop-aws-2.6.0.jar |grep S3AFileSystem
28349 2014-11-13 21:20 org/apache/hadoop/fs/s3a/S3AFileSystem.class
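As a sanity check (a hypothetical command of my own, not output captured from the original run), you can expand the client classpath one entry per line and confirm the tools/lib entry is present:
hadoop classpath | tr ':' '\n' | grep tools/lib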
Is there an order to adding these jars, or am I missing something else critical?
Working from Abhishek's comment on his answer, the only change I needed to make was to mapred-site.xml:
<property>
<!-- Add to the classpath used when running an M/R job -->
<name>mapreduce.application.classpath</name>
<value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*,$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*,$HADOOP_MAPRED_HOME/share/hadoop/tools/lib/*</value>
</property>
No changes needed to any other xml or sh files.
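With that property in place, the same distcp invocation from above should now find the S3 filesystem classes inside the map tasks as well:
hadoop distcp -update hdfs:///files/to/backup s3n://${S3KEY}:${S3SECRET}@bucket/files/to/backup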
You can resolve the s3n issue by adding the following property to core-site.xml:
<property>
<name>fs.s3n.impl</name>
<value>org.apache.hadoop.fs.s3native.NativeS3FileSystem</value>
<description>The FileSystem for s3n: (Native S3) uris.</description>
</property>
It should work after adding that property.
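If you need the s3a scheme as well, a similar property can be declared; this is a sketch based on the class name from the question, not something from the original answer:
<property>
<name>fs.s3a.impl</name>
<value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
<description>The FileSystem for s3a: uris.</description>
</property>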
Edit: If that doesn't resolve your problem, you will have to add the jars to the classpath. Can you check whether mapred-site.xml has mapreduce.application.classpath set to /usr/hdp//hadoop-mapreduce/*? That will include the other related jars in the classpath :)
In current Hadoop (3.1.1) this approach no longer works. You can fix this by uncommenting the HADOOP_OPTIONAL_TOOLS line in the etc/hadoop/hadoop-env.sh file. Among other tools, this enables the hadoop-aws library.
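For reference, the uncommented line can look something like this (the exact value is an assumption on my part; list whichever optional tools you need):
# in etc/hadoop/hadoop-env.sh
export HADOOP_OPTIONAL_TOOLS="hadoop-aws"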