I would like the HDFS user to be different from the one the JVM runs as, because I get this error:
Stream spark: org.apache.hadoop.security.AccessControlException: Permission denied: user=www, access=WRITE, inode="/user/www/.sparkStaging/application_1460635834146_0012":hdfs:hdfs:drwxr-xr-x
I want to change the user from "www" to another one, such as "joe", who has permission to write. (There is no "/user/www" folder, but "/user/joe" exists.)
Here is my Java code:
LOGGER.debug("start submitSparkJob");
Process spark;
SparkLauncher sl;
try {
    sl = new SparkLauncher()
            .setAppName(argsMap.get(SparkParametersEnum.NAME))
            .setSparkHome(argsMap.get(SparkParametersEnum.SPARK_HOME))
            .setAppResource(argsMap.get(SparkParametersEnum.JAR))
            .setMainClass(argsMap.get(SparkParametersEnum.CLASS))
            .addAppArgs(argsMap.get(SparkParametersEnum.ARG))
            .setMaster(argsMap.get(SparkParametersEnum.MASTER))
            .setDeployMode(argsMap.get(SparkParametersEnum.DEPLOY_MODE))
            .setConf(SparkLauncher.DRIVER_MEMORY, "2g")
            .setVerbose(true);
    if (argsMap.containsKey(SparkParametersEnum.STAGING_DIR)) {
        sl.setConf("spark.yarn.stagingDir", argsMap.get(SparkParametersEnum.STAGING_DIR));
    }
    if (argsMap.containsKey(SparkParametersEnum.ACCESS_NAMENODES)) {
        sl.setConf("spark.yarn.access.namenodes", argsMap.get(SparkParametersEnum.ACCESS_NAMENODES));
    }
    if (argsMap.containsKey(SparkParametersEnum.PRINCIPAL)) {
        sl.setConf("spark.yarn.principal", argsMap.get(SparkParametersEnum.PRINCIPAL));
    }
    if (argsMap.containsKey(SparkParametersEnum.DIST_JAR)) {
        sl.setConf("spark.yarn.dist.jars", argsMap.get(SparkParametersEnum.DIST_JAR));
    }
    LOGGER.debug("SparkLauncher set");
    spark = sl.launch();
    LOGGER.debug("SparkLauncher launched");
} catch (IOException e) {
    LOGGER.error("submitSparkJob failed", e);
}
I tried several things, but none of them worked.
Here you can see the stack trace:
15 Feb 2017 15:36:22,794 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: Parsed arguments:
15 Feb 2017 15:36:22,794 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: master yarn//*****
15 Feb 2017 15:36:22,795 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: deployMode cluster
15 Feb 2017 15:36:22,795 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: executorMemory null
15 Feb 2017 15:36:22,795 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: executorCores null
15 Feb 2017 15:36:22,795 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: totalExecutorCores null
15 Feb 2017 15:36:22,795 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: propertiesFile /usr/hdp/2.3.0.0-2557/spark/conf/spark-defaults.conf
15 Feb 2017 15:36:22,796 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: driverMemory 2g
15 Feb 2017 15:36:22,796 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: driverCores null
15 Feb 2017 15:36:22,796 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: driverExtraClassPath null
15 Feb 2017 15:36:22,796 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: driverExtraLibraryPath null
15 Feb 2017 15:36:22,796 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: driverExtraJavaOptions null
15 Feb 2017 15:36:22,796 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: supervise false
15 Feb 2017 15:36:22,797 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: queue null
15 Feb 2017 15:36:22,797 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: numExecutors null
15 Feb 2017 15:36:22,797 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: files null
15 Feb 2017 15:36:22,797 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: pyFiles null
15 Feb 2017 15:36:22,797 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: archives null
15 Feb 2017 15:36:22,797 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: mainClass **********.ExtractLauncher
15 Feb 2017 15:36:22,798 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: primaryResource file:/usr/*****/MyJar.jar
15 Feb 2017 15:36:22,798 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: name mySparkApp
15 Feb 2017 15:36:22,798 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: childArgs [application-context.xml -s "2017-02-08" -e "2017-02-08" -t "******" -te "*****"]
15 Feb 2017 15:36:22,798 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: jars null
15 Feb 2017 15:36:22,798 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: packages null
15 Feb 2017 15:36:22,798 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: repositories null
15 Feb 2017 15:36:22,799 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: verbose true
15 Feb 2017 15:36:22,799 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:
15 Feb 2017 15:36:22,799 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: Spark properties used, including those specified through
15 Feb 2017 15:36:22,800 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: --conf and those from the properties file /usr/hdp/2.3.0.0-2557/spark/conf/spark-defaults.conf:
15 Feb 2017 15:36:22,800 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.yarn.queue -> default
15 Feb 2017 15:36:22,801 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.local.dir -> /hadoop/spark
15 Feb 2017 15:36:22,801 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.history.kerberos.principal -> none
15 Feb 2017 15:36:22,802 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.driver.memory -> 2g
15 Feb 2017 15:36:22,802 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.yarn.max.executor.failures -> 3
15 Feb 2017 15:36:22,802 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.yarn.historyServer.address -> ********:*****
15 Feb 2017 15:36:22,803 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.yarn.services -> org.apache.spark.deploy.yarn.history.YarnHistoryService
15 Feb 2017 15:36:22,803 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.history.ui.port -> *****
15 Feb 2017 15:36:22,804 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.history.provider -> org.apache.spark.deploy.yarn.history.YarnHistoryProvider
15 Feb 2017 15:36:22,804 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.yarn.scheduler.heartbeat.interval-ms -> 5000
15 Feb 2017 15:36:22,805 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.yarn.submit.file.replication -> 3
15 Feb 2017 15:36:22,805 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.yarn.executor.memoryOverhead -> 384
15 Feb 2017 15:36:22,805 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.yarn.containerLauncherMaxThreads -> 25
15 Feb 2017 15:36:22,806 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.yarn.driver.memoryOverhead -> 384
15 Feb 2017 15:36:22,806 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.history.kerberos.keytab -> none
15 Feb 2017 15:36:22,807 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.yarn.preserve.staging.files -> false
15 Feb 2017 15:36:22,807 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:
15 Feb 2017 15:36:22,808 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:
15 Feb 2017 15:36:22,814 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: Main class:
15 Feb 2017 15:36:22,814 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: org.apache.spark.deploy.yarn.Client
15 Feb 2017 15:36:22,815 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: Arguments:
15 Feb 2017 15:36:22,815 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: --name
15 Feb 2017 15:36:22,815 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: mySparkApp
15 Feb 2017 15:36:22,815 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: --driver-memory
15 Feb 2017 15:36:22,815 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: 2g
15 Feb 2017 15:36:22,815 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: --jar
15 Feb 2017 15:36:22,816 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: file:/usr/***/MyJar.jar
15 Feb 2017 15:36:22,816 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: --class
15 Feb 2017 15:36:22,816 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: **********.ExtractLauncher
15 Feb 2017 15:36:22,816 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: --arg
15 Feb 2017 15:36:22,816 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: application-context.xml -s "2017-02-08" -e "2017-02-08" -t "******" -te "******"
15 Feb 2017 15:36:22,817 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: System properties:
15 Feb 2017 15:36:22,817 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.yarn.queue -> default
15 Feb 2017 15:36:22,817 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.local.dir -> /hadoop/spark
15 Feb 2017 15:36:22,817 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.history.kerberos.principal -> none
15 Feb 2017 15:36:22,817 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.driver.memory -> 2g
15 Feb 2017 15:36:22,818 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.yarn.max.executor.failures -> 3
15 Feb 2017 15:36:22,818 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.yarn.historyServer.address -> ******:*****
15 Feb 2017 15:36:22,818 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.yarn.services -> org.apache.spark.deploy.yarn.history.YarnHistoryService
15 Feb 2017 15:36:22,818 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.history.ui.port -> *****
15 Feb 2017 15:36:22,818 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: SPARK_SUBMIT -> true
15 Feb 2017 15:36:22,818 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.history.provider -> org.apache.spark.deploy.yarn.history.YarnHistoryProvider
15 Feb 2017 15:36:22,818 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.app.name -> mySparkApp
15 Feb 2017 15:36:22,819 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.yarn.executor.memoryOverhead -> 384
15 Feb 2017 15:36:22,819 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.yarn.submit.file.replication -> 3
15 Feb 2017 15:36:22,819 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.yarn.scheduler.heartbeat.interval-ms -> 5000
15 Feb 2017 15:36:22,819 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.yarn.driver.memoryOverhead -> 384
15 Feb 2017 15:36:22,819 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.yarn.containerLauncherMaxThreads -> 25
15 Feb 2017 15:36:22,820 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.history.kerberos.keytab -> none
15 Feb 2017 15:36:22,820 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.yarn.preserve.staging.files -> false
15 Feb 2017 15:36:22,821 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.master -> yarn-cluster
15 Feb 2017 15:36:22,821 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: Classpath elements:
15 Feb 2017 15:36:22,821 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:
15 Feb 2017 15:36:22,821 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:
15 Feb 2017 15:36:22,821 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:
15 Feb 2017 15:36:23,275 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: 17/02/15 15:36:23 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15 Feb 2017 15:36:23,796 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: 17/02/15 15:36:23 INFO RMProxy: Connecting to ResourceManager at *********:*******
15 Feb 2017 15:36:24,030 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: 17/02/15 15:36:24 INFO Client: Requesting a new application from cluster with 1 NodeManagers
15 Feb 2017 15:36:24,043 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: 17/02/15 15:36:24 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (5120 MB per container)
15 Feb 2017 15:36:24,044 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: 17/02/15 15:36:24 INFO Client: Will allocate AM container, with 2432 MB memory including 384 MB overhead
15 Feb 2017 15:36:24,045 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: 17/02/15 15:36:24 INFO Client: Setting up container launch context for our AM
15 Feb 2017 15:36:24,046 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: 17/02/15 15:36:24 INFO Client: Preparing resources for our AM container
15 Feb 2017 15:36:24,364 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: 17/02/15 15:36:24 WARN DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
15 Feb 2017 15:36:24,402 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: Error: application failed with exception
15 Feb 2017 15:36:24,402 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: org.apache.hadoop.security.AccessControlException: Permission denied: user=www, access=WRITE, inode="/user/www/.sparkStaging/application_1460635834146_0012":hdfs:hdfs:drwxr-xr-x
15 Feb 2017 15:36:24,402 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:319)
15 Feb 2017 15:36:24,402 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:292)
15 Feb 2017 15:36:24,403 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:213)
15 Feb 2017 15:36:24,403 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190)
15 Feb 2017 15:36:24,403 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1698)
15 Feb 2017 15:36:24,403 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1682)
15 Feb 2017 15:36:24,403 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:1665)
15 Feb 2017 15:36:24,403 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.mkdirs(FSDirMkdirOp.java:71)
15 Feb 2017 15:36:24,404 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3895)
15 Feb 2017 15:36:24,404 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:983)
15 Feb 2017 15:36:24,404 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:622)
15 Feb 2017 15:36:24,404 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
15 Feb 2017 15:36:24,404 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
15 Feb 2017 15:36:24,404 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
15 Feb 2017 15:36:24,404 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2081)
15 Feb 2017 15:36:24,405 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2077)
15 Feb 2017 15:36:24,410 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at java.security.AccessController.doPrivileged(Native Method)
15 Feb 2017 15:36:24,410 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at javax.security.auth.Subject.doAs(Subject.java:422)
15 Feb 2017 15:36:24,410 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
15 Feb 2017 15:36:24,411 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2075)
15 Feb 2017 15:36:24,411 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:
15 Feb 2017 15:36:24,414 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
15 Feb 2017 15:36:24,414 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
15 Feb 2017 15:36:24,414 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
15 Feb 2017 15:36:24,414 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
15 Feb 2017 15:36:24,414 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
15 Feb 2017 15:36:24,415 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
15 Feb 2017 15:36:24,415 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:3010)
15 Feb 2017 15:36:24,415 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:2978)
15 Feb 2017 15:36:24,415 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at org.apache.hadoop.hdfs.DistributedFileSystem$21.doCall(DistributedFileSystem.java:1047)
15 Feb 2017 15:36:24,415 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at org.apache.hadoop.hdfs.DistributedFileSystem$21.doCall(DistributedFileSystem.java:1043)
15 Feb 2017 15:36:24,415 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
15 Feb 2017 15:36:24,416 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirsInternal(DistributedFileSystem.java:1043)
15 Feb 2017 15:36:24,416 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:1036)
15 Feb 2017 15:36:24,416 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1877)
15 Feb 2017 15:36:24,416 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:598)
15 Feb 2017 15:36:24,416 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:224)
15 Feb 2017 15:36:24,416 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:384)
15 Feb 2017 15:36:24,416 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:102)
15 Feb 2017 15:36:24,417 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at org.apache.spark.deploy.yarn.Client.run(Client.scala:619)
15 Feb 2017 15:36:24,417 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at org.apache.spark.deploy.yarn.Client$.main(Client.scala:647)
15 Feb 2017 15:36:24,417 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at org.apache.spark.deploy.yarn.Client.main(Client.scala)
15 Feb 2017 15:36:24,417 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
15 Feb 2017 15:36:24,417 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
15 Feb 2017 15:36:24,417 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
15 Feb 2017 15:36:24,417 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at java.lang.reflect.Method.invoke(Method.java:497)
15 Feb 2017 15:36:24,421 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:577)
15 Feb 2017 15:36:24,421 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:174)
15 Feb 2017 15:36:24,421 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:197)
15 Feb 2017 15:36:24,422 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:112)
15 Feb 2017 15:36:24,422 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
15 Feb 2017 15:36:24,422 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=www, access=WRITE, inode="/user/www/.sparkStaging/application_1460635834146_0012":hdfs:hdfs:drwxr-xr-x
15 Feb 2017 15:36:24,422 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:319)
15 Feb 2017 15:36:24,422 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:292)
15 Feb 2017 15:36:24,422 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:213)
15 Feb 2017 15:36:24,422 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190)
15 Feb 2017 15:36:24,422 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1698)
15 Feb 2017 15:36:24,423 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1682)
15 Feb 2017 15:36:24,432 [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: ... 33 more
If someone has an idea :) Thanks!
You can set the following environment variable, which will be picked up automatically:
export HADOOP_USER_NAME=<your hdfs user>
Also mentioned here:
HADOOP_USER_NAME
This is the Hadoop environment variable which propagates the identity of a user in an insecure cluster.
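
Since SparkLauncher forks a spark-submit child process, the variable has to be set in that child's environment, not just in the parent JVM. Below is a minimal sketch of one way to do that from the Java code above; it assumes an insecure (non-Kerberos) cluster and reuses the question's argsMap / SparkParametersEnum and the target user "joe":

import java.util.HashMap;
import java.util.Map;
import org.apache.spark.launcher.SparkLauncher;

// Environment variables passed to this constructor are set in the
// spark-submit child process, so the YARN client submits as "joe"
// instead of the JVM's OS user ("www").
Map<String, String> env = new HashMap<>();
env.put("HADOOP_USER_NAME", "joe");

SparkLauncher sl = new SparkLauncher(env)
        .setAppName(argsMap.get(SparkParametersEnum.NAME))
        // ... same configuration chain as in the question ...
        .setVerbose(true);
Process spark = sl.launch();

With that in place, the staging directory should be created under /user/joe/.sparkStaging, which "joe" can write to. Note that HADOOP_USER_NAME is only honored on insecure clusters; with Kerberos enabled you would authenticate via spark.yarn.principal and a keytab instead.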