I need to use HDFS cluster from remote desktop through Java API. Everything works OK until it comes to write access. If I'm trying to create any file I receive access permission exception. Path looks good but exception indicates my remote desktop user name which is of course is not what I need to access needed HDFS directory.
The question is: - Is there any way to represent different user name using 'simple' authentication in Java API? - Could you please point some good explanation of authentication / authorization schemes in hadoop / HDFS preferable with Java API examples?
Yes, I already know 'whoami' could be overloaded in this case using shell alias but I prefer to avoid solutions like this. Also specifics here is I dislike usage of some tricks like pipes through SSH and scripts. I'd like to perform everything using just Java API. Thank you in advance.
After some studying I came to the following solution:
Sample code probably useful for people both for 'fake authentication' and remote HDFS access:
package org.myorg;
import java.security.PrivilegedExceptionAction;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileStatus;
public class HdfsTest {
    public static void main(String args[]) {
        try {
            UserGroupInformation ugi
                = UserGroupInformation.createRemoteUser("hbase");
            ugi.doAs(new PrivilegedExceptionAction<Void>() {
                public Void run() throws Exception {
                    Configuration conf = new Configuration();
                    conf.set("fs.defaultFS", "hdfs://1.2.3.4:8020/user/hbase");
                    conf.set("hadoop.job.ugi", "hbase");
                    FileSystem fs = FileSystem.get(conf);
                    fs.createNewFile(new Path("/user/hbase/test"));
                    FileStatus[] status = fs.listStatus(new Path("/user/hbase"));
                    for(int i=0;i<status.length;i++){
                        System.out.println(status[i].getPath());
                    }
                    return null;
                }
            });
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Useful reference for those who have a similar problem:
UPDATE:
Alternative for those who uses command line hdfs or hadoop utility without local user needed:
 HADOOP_USER_NAME=hdfs hdfs fs -put /root/MyHadoop/file1.txt /
What you actually do is you read local file in accordance to your local permissions but when placing file on HDFS you are authenticated like user hdfs.
This has pretty similar properties to API code illustrated:
sudo.If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With