Replacing HDFS on local disk with S3 gives an error (org.apache.hadoop.service.AbstractService)

We are trying to set up Cloudera 5.5 where HDFS will work on S3 only. For that, we have already configured the necessary properties in core-site.xml:

<property>
    <name>fs.s3a.access.key</name>
    <value>################</value>
</property>
<property>
    <name>fs.s3a.secret.key</name>
    <value>###############</value>
</property>
<property>
    <name>fs.default.name</name>
    <value>s3a://bucket_Name</value>
</property>
<property>
    <name>fs.defaultFS</name>
    <value>s3a://bucket_Name</value>
</property>
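
As a quick sanity check (this snippet and the class name are our own illustration, not part of the original setup; the bucket name is the same placeholder), you can ask Hadoop which URI it resolves as the default filesystem. Note that fs.defaultFS supersedes the deprecated fs.default.name:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class CheckDefaultFs {
    public static void main(String[] args) {
        // Picks up core-site.xml from the classpath.
        Configuration conf = new Configuration();
        // fs.defaultFS takes precedence over the deprecated fs.default.name.
        System.out.println(FileSystem.getDefaultUri(conf)); // expect s3a://bucket_Name
    }
}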

After setting this up, we were able to browse the files of the S3 bucket with the command:

hadoop fs -ls /

It shows only the files available on S3.
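
For what it's worth, the shell command above goes through Hadoop's older FileSystem API. A minimal Java equivalent (our sketch, assuming the same core-site.xml is on the classpath) would be:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ListRoot {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // FileSystem.get() resolves the s3a scheme via the FileSystem-side
        // binding (fs.s3a.impl), which is why this path works.
        FileSystem fs = FileSystem.get(conf);
        for (FileStatus status : fs.listStatus(new Path("/"))) {
            System.out.println(status.getPath());
        }
    }
}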

But when we start the YARN services, the JobHistory Server fails to start with the error below, and we get the same error when launching Pig jobs:

PriviledgedActionException as:mapred (auth:SIMPLE) cause:org.apache.hadoop.fs.UnsupportedFileSystemException: No AbstractFileSystem for scheme: s3a
ERROR   org.apache.hadoop.mapreduce.v2.jobhistory.JobHistoryUtils   
Unable to create default file context [s3a://kyvosps]
org.apache.hadoop.fs.UnsupportedFileSystemException: No AbstractFileSystem for scheme: s3a
    at org.apache.hadoop.fs.AbstractFileSystem.createFileSystem(AbstractFileSystem.java:154)
    at org.apache.hadoop.fs.AbstractFileSystem.get(AbstractFileSystem.java:242)
    at org.apache.hadoop.fs.FileContext$2.run(FileContext.java:337)
    at org.apache.hadoop.fs.FileContext$2.run(FileContext.java:334)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
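
Per the log line "Unable to create default file context", the failure happens in the newer FileContext API, not in the FileSystem API that the shell uses. The exception can be reproduced outside the JobHistory Server with a few lines (our own sketch, class name included):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileContext;

public class ReproduceFailure {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // FileContext binds schemes through fs.AbstractFileSystem.<scheme>.impl,
        // not fs.<scheme>.impl. With no AbstractFileSystem registered for s3a,
        // the next line throws UnsupportedFileSystemException:
        // "No AbstractFileSystem for scheme: s3a" -- the same error as above.
        FileContext fc = FileContext.getFileContext(conf);
        System.out.println(fc.getWorkingDirectory());
    }
}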

Searching on the Internet, we found that we also need to set the following properties in core-site.xml:

<property>
  <name>fs.s3a.impl</name>
  <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
  <description>The implementation class of the S3A Filesystem</description>
</property>
<property>
    <name>fs.AbstractFileSystem.s3a.impl</name>
    <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
    <description>The FileSystem for S3A Filesystem</description>
</property>

After setting the above properties, we are getting the following error:

org.apache.hadoop.service.AbstractService   
Service org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager failed in state INITED; cause: java.lang.RuntimeException: java.lang.NoSuchMethodException: org.apache.hadoop.fs.s3a.S3AFileSystem.<init>(java.net.URI, org.apache.hadoop.conf.Configuration)
java.lang.RuntimeException: java.lang.NoSuchMethodException: org.apache.hadoop.fs.s3a.S3AFileSystem.<init>(java.net.URI, org.apache.hadoop.conf.Configuration)
    at org.apache.hadoop.fs.AbstractFileSystem.newInstance(AbstractFileSystem.java:131)
    at org.apache.hadoop.fs.AbstractFileSystem.createFileSystem(AbstractFileSystem.java:157)
    at org.apache.hadoop.fs.AbstractFileSystem.get(AbstractFileSystem.java:242)
    at org.apache.hadoop.fs.FileContext$2.run(FileContext.java:337)
    at org.apache.hadoop.fs.FileContext$2.run(FileContext.java:334)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
    at org.apache.hadoop.fs.FileContext.getAbstractFileSystem(FileContext.java:334)
    at org.apache.hadoop.fs.FileContext.getFileContext(FileContext.java:451)
    at org.apache.hadoop.fs.FileContext.getFileContext(FileContext.java:473)
    at org.apache.hadoop.mapreduce.v2.jobhistory.JobHistoryUtils.getDefaultFileContext(JobHistoryUtils.java:247)

The JARs needed for this are in place, but we are still getting the error. Any help will be great. Thanks in advance.

Update

I tried removing the property fs.AbstractFileSystem.s3a.impl, but it gives me the same first exception, the one I was getting previously:

org.apache.hadoop.security.UserGroupInformation 
PriviledgedActionException as:mapred (auth:SIMPLE) cause:org.apache.hadoop.fs.UnsupportedFileSystemException: No AbstractFileSystem for scheme: s3a
ERROR   org.apache.hadoop.mapreduce.v2.jobhistory.JobHistoryUtils   
Unable to create default file context [s3a://bucket_name]
org.apache.hadoop.fs.UnsupportedFileSystemException: No AbstractFileSystem for scheme: s3a
    at org.apache.hadoop.fs.AbstractFileSystem.createFileSystem(AbstractFileSystem.java:154)
    at org.apache.hadoop.fs.AbstractFileSystem.get(AbstractFileSystem.java:242)
    at org.apache.hadoop.fs.FileContext$2.run(FileContext.java:337)
    at org.apache.hadoop.fs.FileContext$2.run(FileContext.java:334)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
    at org.apache.hadoop.fs.FileContext.getAbstractFileSystem(FileContext.java:334)
    at org.apache.hadoop.fs.FileContext.getFileContext(FileContext.java:451)
    at org.apache.hadoop.fs.FileContext.getFileContext(FileContext.java:473)
asked Dec 16 '15 by Vikas Hardia




1 Answer

The problem is not with the location of the JARs.

The problem is with the setting:

<property>
    <name>fs.AbstractFileSystem.s3a.impl</name>
    <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
    <description>The FileSystem for S3A Filesystem</description>
</property>

This setting is not needed. Because of this setting, Hadoop searches for the following constructor in the S3AFileSystem class, and there is no such constructor:

S3AFileSystem(URI theUri, Configuration conf);

The following exception clearly tells us that it is unable to find a constructor for S3AFileSystem with URI and Configuration parameters:

java.lang.RuntimeException: java.lang.NoSuchMethodException: org.apache.hadoop.fs.s3a.S3AFileSystem.<init>(java.net.URI, org.apache.hadoop.conf.Configuration)
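
To make the failure concrete, here is a simplified sketch (ours, not the actual Hadoop source) of the reflective lookup that AbstractFileSystem performs when fs.AbstractFileSystem.s3a.impl points at S3AFileSystem:

import java.lang.reflect.Constructor;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;

public class ConstructorLookupDemo {
    public static void main(String[] args) throws Exception {
        Class<?> clazz = Class.forName("org.apache.hadoop.fs.s3a.S3AFileSystem");
        // AbstractFileSystem instantiates the configured class through a
        // (URI, Configuration) constructor. S3AFileSystem follows the old
        // FileSystem contract instead: a no-arg constructor plus
        // initialize(URI, Configuration). So this lookup throws
        // java.lang.NoSuchMethodException, which Hadoop wraps in the
        // RuntimeException shown above.
        Constructor<?> ctor =
            clazz.getDeclaredConstructor(URI.class, Configuration.class);
        ctor.newInstance(URI.create("s3a://bucket_Name"), new Configuration());
    }
}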

To resolve this problem, remove the fs.AbstractFileSystem.s3a.impl setting from core-site.xml. Having just the fs.s3a.impl setting in core-site.xml should solve your problem.

EDIT: org.apache.hadoop.fs.s3a.S3AFileSystem just implements FileSystem.

Hence, you cannot set the value of fs.AbstractFileSystem.s3a.impl to org.apache.hadoop.fs.s3a.S3AFileSystem, since org.apache.hadoop.fs.s3a.S3AFileSystem does not implement AbstractFileSystem.

I am using Hadoop 2.7.0, and in this version S3A is not exposed as an AbstractFileSystem.

There is a JIRA ticket, https://issues.apache.org/jira/browse/HADOOP-11262, to implement this, and the fix is available in Hadoop 2.8.0.
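
For reference, the fix in HADOOP-11262 is a thin adapter: an AbstractFileSystem subclass that delegates every call to the existing S3AFileSystem. A sketch of that pattern using Hadoop's DelegateToFileSystem base class (the class name S3ADelegate is ours; Hadoop 2.8.0 ships essentially this as org.apache.hadoop.fs.s3a.S3A):

import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.DelegateToFileSystem;
import org.apache.hadoop.fs.s3a.S3AFileSystem;

public class S3ADelegate extends DelegateToFileSystem {
    public S3ADelegate(URI theUri, Configuration conf)
            throws IOException, URISyntaxException {
        // (uri, wrapped FileSystem, conf, supported scheme, authority required)
        super(theUri, new S3AFileSystem(), conf, "s3a", false);
    }
}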

Assuming your JAR exposes S3A as an AbstractFileSystem, you need to set fs.AbstractFileSystem.s3a.impl as follows:

<property>
    <name>fs.AbstractFileSystem.s3a.impl</name>
    <value>org.apache.hadoop.fs.s3a.S3A</value>
</property>

That will solve your problem.
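
Once the AbstractFileSystem binding resolves, a quick FileContext check (our sketch, mirroring what JobHistoryUtils does internally) should succeed instead of throwing UnsupportedFileSystemException:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileContext;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;

public class VerifyBinding {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Resolves the default filesystem via fs.AbstractFileSystem.s3a.impl.
        FileContext fc = FileContext.getFileContext(conf);
        RemoteIterator<FileStatus> it = fc.listStatus(new Path("/"));
        while (it.hasNext()) {
            System.out.println(it.next().getPath());
        }
    }
}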

answered by Manjunath Ballur