I need to access an HDFS cluster from a remote desktop through the Java API. Everything works fine until it comes to write access: if I try to create any file, I get an access permission exception. The path looks good, but the exception reports my remote desktop user name, which is of course not the user I need in order to access the target HDFS directory.
The question is:
- Is there any way to present a different user name using 'simple' authentication in the Java API?
- Could you point me to a good explanation of authentication / authorization schemes in Hadoop / HDFS, preferably with Java API examples?
Yes, I already know that 'whoami' could be overridden in this case with a shell alias, but I would prefer to avoid solutions like that. Another specific here is that I dislike tricks such as pipes through SSH and scripts; I'd like to do everything with the Java API alone. Thank you in advance.
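For context, a minimal sketch of the kind of call that fails for me (the class name, namenode address and path are illustrative, not my real setup):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsWriteFails {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://1.2.3.4:8020"); // illustrative namenode address
        FileSystem fs = FileSystem.get(conf);
        // Fails with org.apache.hadoop.security.AccessControlException:
        // Permission denied: user=<my desktop user>, access=WRITE, inode="/user/hbase"...
        fs.createNewFile(new Path("/user/hbase/test"));
    }
}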
After some studying I came to the following solution:
Sample code, probably useful both for 'fake authentication' and for remote HDFS access:
package org.myorg;

import java.security.PrivilegedExceptionAction;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.UserGroupInformation;

public class HdfsTest {

    public static void main(String[] args) {
        try {
            // Act as the remote user "hbase". With 'simple' authentication (no Kerberos)
            // the namenode trusts the client-supplied user name.
            UserGroupInformation ugi = UserGroupInformation.createRemoteUser("hbase");

            ugi.doAs(new PrivilegedExceptionAction<Void>() {
                public Void run() throws Exception {
                    Configuration conf = new Configuration();
                    conf.set("fs.defaultFS", "hdfs://1.2.3.4:8020/user/hbase");
                    // kept for older Hadoop versions that still read hadoop.job.ugi
                    conf.set("hadoop.job.ugi", "hbase");

                    FileSystem fs = FileSystem.get(conf);

                    // Create an empty file as user "hbase" ...
                    fs.createNewFile(new Path("/user/hbase/test"));

                    // ... and list the directory to verify it is there.
                    FileStatus[] status = fs.listStatus(new Path("/user/hbase"));
                    for (int i = 0; i < status.length; i++) {
                        System.out.println(status[i].getPath());
                    }
                    return null;
                }
            });
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
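Note that createNewFile() only creates an empty file. A minimal sketch of writing actual content the same way (the file name and contents below are made up for illustration):

package org.myorg;

import java.security.PrivilegedExceptionAction;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.UserGroupInformation;

public class HdfsWriteTest {

    public static void main(String[] args) throws Exception {
        UserGroupInformation ugi = UserGroupInformation.createRemoteUser("hbase");
        ugi.doAs(new PrivilegedExceptionAction<Void>() {
            public Void run() throws Exception {
                Configuration conf = new Configuration();
                conf.set("fs.defaultFS", "hdfs://1.2.3.4:8020");
                FileSystem fs = FileSystem.get(conf);

                // create() returns an FSDataOutputStream we can write to.
                FSDataOutputStream out = fs.create(new Path("/user/hbase/hello.txt"));
                try {
                    out.writeUTF("hello from a remote client");
                } finally {
                    out.close();
                }
                return null;
            }
        });
    }
}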
UPDATE:
An alternative for those who use the command-line hdfs or hadoop utility, without needing a matching local user:
HADOOP_USER_NAME=hdfs hdfs dfs -put /root/MyHadoop/file1.txt /
What you actually do is read the local file according to your local permissions, but when placing the file on HDFS you are authenticated as the user hdfs.
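The same effect can be had from plain Java without doAs: as far as I can tell (Hadoop 2.x), UserGroupInformation also honors the HADOOP_USER_NAME environment variable / system property under 'simple' authentication. A minimal sketch, with an illustrative class name, namenode address and file paths:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsUserNameTest {
    public static void main(String[] args) throws Exception {
        // Must be set before the first FileSystem / UserGroupInformation call,
        // because the login user is cached once it has been resolved.
        System.setProperty("HADOOP_USER_NAME", "hdfs");

        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://1.2.3.4:8020"); // illustrative namenode address
        FileSystem fs = FileSystem.get(conf);

        // Analogous to HADOOP_USER_NAME=hdfs hdfs dfs -put ... on the command line.
        fs.copyFromLocalFile(new Path("/root/MyHadoop/file1.txt"), new Path("/"));
    }
}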
This has pretty similar properties to the API code illustrated above: in particular, you don't need sudo or a matching local user on the client machine.