 

Permission denied when starting the Spark shell on an AWS EMR cluster

I launched a cluster with 2 machines (1 master, 1 core) on the AWS EMR service, using 1 key pair.

I then logged into the master instance over ssh, using the .pem file I had created.

That succeeded.

When I then try to run spark-shell or pyspark on the master instance, I get the following error:

Error initializing SparkContext.
org.apache.hadoop.security.AccessControlException: Permission denied:   user=ec2-user, access=WRITE, inode="/user":hdfs:hadoop:drwxr-xr-x
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:271)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:257)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:238)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:179)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6512)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6494)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:6446)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:4248)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:4218)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:4191)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:813)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:600)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:635)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)
Hello lad, asked Oct 10 '15 11:10



2 Answers

Solved it myself.

Logging in over ssh as ec2-user succeeds, but it causes the permission error when starting Spark.

Logging in over ssh as the hadoop user solves the problem.

Hello lad, answered Dec 22 '22 20:12



To solve this issue, you don't always have to ssh in as the hadoop user. The shell tries to access the current user's home directory on HDFS, and that directory does not exist yet for a new user.

Running the following commands as the hadoop user (e.g. via su) allowed me to use spark-shell as my normal user:

hdfs dfs -mkdir /user/myuser
hdfs dfs -chown myuser:hadoop /user/myuser

(Replace myuser with the user you want to run the shell as)
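If you need to do this for several users, the two commands above can be wrapped in a small function. This is only a sketch: make_hdfs_home and DRY_RUN are made-up names for illustration, not part of Hadoop or EMR, and the -p flag on mkdir is an addition so the command does not fail when the directory already exists. It assumes you run it as a user that is allowed to write to /user on HDFS (the hadoop user in my case).

```shell
#!/bin/sh
# Hypothetical helper: create an HDFS home directory for a given user.
# With DRY_RUN set to a non-empty value, the hdfs commands are only printed.
make_hdfs_home() {
    target_user="$1"
    if [ -z "$target_user" ]; then
        echo "usage: make_hdfs_home <username>" >&2
        return 1
    fi

    # Expands to "echo" in dry-run mode, and to nothing otherwise.
    run="${DRY_RUN:+echo}"

    # Same two steps as above: create the directory, then hand it to the user.
    $run hdfs dfs -mkdir -p "/user/$target_user"
    $run hdfs dfs -chown "$target_user:hadoop" "/user/$target_user"
}

# Example (run as the hadoop user):
#   make_hdfs_home ec2-user
```

Setting DRY_RUN=1 before calling the function prints the hdfs commands instead of running them, which is a cheap way to check the username before touching HDFS.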

James k, answered Dec 22 '22 19:12
