
How to change the HDFS user when using spark-submit from Java

I would like to change the user that is used for HDFS access so that it differs from the one the JVM runs as, because I get this error:

Stream spark: org.apache.hadoop.security.AccessControlException: Permission denied: user=www, access=WRITE, inode="/user/www/.sparkStaging/application_1460635834146_0012":hdfs:hdfs:drwxr-xr-x

I want to change the user from "www" to another one, such as "joe", that has permission to write. (There is no "/user/www" folder in HDFS, but "/user/joe" exists.)

Here is my Java code:

    LOGGER.debug("start submitSparkJob");

    Process spark;
    SparkLauncher sl;
    try {
        sl = new SparkLauncher()
                .setAppName(argsMap.get(SparkParametersEnum.NAME))
                .setSparkHome(argsMap.get(SparkParametersEnum.SPARK_HOME))
                .setAppResource(argsMap.get(SparkParametersEnum.JAR))
                .setMainClass(argsMap.get(SparkParametersEnum.CLASS))
                .addAppArgs(argsMap.get(SparkParametersEnum.ARG))
                .setMaster(argsMap.get(SparkParametersEnum.MASTER))
                .setDeployMode(argsMap.get(SparkParametersEnum.DEPLOY_MODE))
                .setConf(SparkLauncher.DRIVER_MEMORY, "2g")
                .setVerbose(true);


        if (argsMap.containsKey(SparkParametersEnum.STAGING_DIR)) {
            sl.setConf("spark.yarn.stagingDir", argsMap.get(SparkParametersEnum.STAGING_DIR));
        }
        if (argsMap.containsKey(SparkParametersEnum.ACCESS_NAMENODES)) {
            sl.setConf("spark.yarn.access.namenodes", argsMap.get(SparkParametersEnum.ACCESS_NAMENODES));
        }
        if (argsMap.containsKey(SparkParametersEnum.PRINCIPAL)) {
            sl.setConf("spark.yarn.principal", argsMap.get(SparkParametersEnum.PRINCIPAL));
        }
        if (argsMap.containsKey(SparkParametersEnum.DIST_JAR)) {
            sl.setConf("spark.yarn.dist.jars", argsMap.get(SparkParametersEnum.DIST_JAR));
        }

        LOGGER.debug("SparkLauncher set");

        spark = sl.launch();

        LOGGER.debug("SparkLauncher launched");
    } catch (IOException e) {
        LOGGER.error("spark-submit launch failed", e);
    }

I tried:

  • Set the user with System.setProperty("user.name", "joe");
  • Change the option spark.yarn.stagingDir
  • Change the option spark.yarn.access.namenodes
  • Change the option spark.yarn.dist.jars

But none of these worked.

Here you can see the stack trace:

15 Feb 2017 15:36:22,794  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: Parsed arguments:
15 Feb 2017 15:36:22,794  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:   master                  yarn//*****
15 Feb 2017 15:36:22,795  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:   deployMode              cluster
15 Feb 2017 15:36:22,795  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:   executorMemory          null
15 Feb 2017 15:36:22,795  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:   executorCores           null
15 Feb 2017 15:36:22,795  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:   totalExecutorCores      null
15 Feb 2017 15:36:22,795  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:   propertiesFile          /usr/hdp/2.3.0.0-2557/spark/conf/spark-defaults.conf
15 Feb 2017 15:36:22,796  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:   driverMemory            2g
15 Feb 2017 15:36:22,796  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:   driverCores             null
15 Feb 2017 15:36:22,796  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:   driverExtraClassPath    null
15 Feb 2017 15:36:22,796  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:   driverExtraLibraryPath  null
15 Feb 2017 15:36:22,796  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:   driverExtraJavaOptions  null
15 Feb 2017 15:36:22,796  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:   supervise               false
15 Feb 2017 15:36:22,797  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:   queue                   null
15 Feb 2017 15:36:22,797  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:   numExecutors            null
15 Feb 2017 15:36:22,797  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:   files                   null
15 Feb 2017 15:36:22,797  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:   pyFiles                 null
15 Feb 2017 15:36:22,797  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:   archives                null
15 Feb 2017 15:36:22,797  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:   mainClass               **********.ExtractLauncher
15 Feb 2017 15:36:22,798  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:   primaryResource         file:/usr/*****/MyJar.jar
15 Feb 2017 15:36:22,798  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:   name                    mySparkApp
15 Feb 2017 15:36:22,798  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:   childArgs               [application-context.xml -s "2017-02-08" -e "2017-02-08" -t "******" -te "*****"]
15 Feb 2017 15:36:22,798  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:   jars                    null
15 Feb 2017 15:36:22,798  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:   packages                null
15 Feb 2017 15:36:22,798  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:   repositories            null
15 Feb 2017 15:36:22,799  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:   verbose                 true
15 Feb 2017 15:36:22,799  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:
15 Feb 2017 15:36:22,799  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: Spark properties used, including those specified through
15 Feb 2017 15:36:22,800  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:  --conf and those from the properties file /usr/hdp/2.3.0.0-2557/spark/conf/spark-defaults.conf:
15 Feb 2017 15:36:22,800  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:   spark.yarn.queue -> default
15 Feb 2017 15:36:22,801  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:   spark.local.dir -> /hadoop/spark
15 Feb 2017 15:36:22,801  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:   spark.history.kerberos.principal -> none
15 Feb 2017 15:36:22,802  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:   spark.driver.memory -> 2g
15 Feb 2017 15:36:22,802  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:   spark.yarn.max.executor.failures -> 3
15 Feb 2017 15:36:22,802  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:   spark.yarn.historyServer.address -> ********:*****
15 Feb 2017 15:36:22,803  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:   spark.yarn.services -> org.apache.spark.deploy.yarn.history.YarnHistoryService
15 Feb 2017 15:36:22,803  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:   spark.history.ui.port -> *****
15 Feb 2017 15:36:22,804  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:   spark.history.provider -> org.apache.spark.deploy.yarn.history.YarnHistoryProvider
15 Feb 2017 15:36:22,804  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:   spark.yarn.scheduler.heartbeat.interval-ms -> 5000
15 Feb 2017 15:36:22,805  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:   spark.yarn.submit.file.replication -> 3
15 Feb 2017 15:36:22,805  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:   spark.yarn.executor.memoryOverhead -> 384
15 Feb 2017 15:36:22,805  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:   spark.yarn.containerLauncherMaxThreads -> 25
15 Feb 2017 15:36:22,806  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:   spark.yarn.driver.memoryOverhead -> 384
15 Feb 2017 15:36:22,806  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:   spark.history.kerberos.keytab -> none
15 Feb 2017 15:36:22,807  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:   spark.yarn.preserve.staging.files -> false
15 Feb 2017 15:36:22,807  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:
15 Feb 2017 15:36:22,808  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:
15 Feb 2017 15:36:22,814  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: Main class:
15 Feb 2017 15:36:22,814  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: org.apache.spark.deploy.yarn.Client
15 Feb 2017 15:36:22,815  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: Arguments:
15 Feb 2017 15:36:22,815  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: --name
15 Feb 2017 15:36:22,815  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: mySparkApp
15 Feb 2017 15:36:22,815  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: --driver-memory
15 Feb 2017 15:36:22,815  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: 2g
15 Feb 2017 15:36:22,815  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: --jar
15 Feb 2017 15:36:22,816  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: file:/usr/***/MyJar.jar
15 Feb 2017 15:36:22,816  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: --class
15 Feb 2017 15:36:22,816  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: **********.ExtractLauncher
15 Feb 2017 15:36:22,816  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: --arg
15 Feb 2017 15:36:22,816  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: application-context.xml -s "2017-02-08" -e "2017-02-08" -t "******" -te "******"
15 Feb 2017 15:36:22,817  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: System properties:
15 Feb 2017 15:36:22,817  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.yarn.queue -> default
15 Feb 2017 15:36:22,817  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.local.dir -> /hadoop/spark
15 Feb 2017 15:36:22,817  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.history.kerberos.principal -> none
15 Feb 2017 15:36:22,817  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.driver.memory -> 2g
15 Feb 2017 15:36:22,818  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.yarn.max.executor.failures -> 3
15 Feb 2017 15:36:22,818  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.yarn.historyServer.address -> ******:*****
15 Feb 2017 15:36:22,818  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.yarn.services -> org.apache.spark.deploy.yarn.history.YarnHistoryService
15 Feb 2017 15:36:22,818  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.history.ui.port -> *****
15 Feb 2017 15:36:22,818  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: SPARK_SUBMIT -> true
15 Feb 2017 15:36:22,818  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.history.provider -> org.apache.spark.deploy.yarn.history.YarnHistoryProvider
15 Feb 2017 15:36:22,818  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.app.name -> mySparkApp
15 Feb 2017 15:36:22,819  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.yarn.executor.memoryOverhead -> 384
15 Feb 2017 15:36:22,819  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.yarn.submit.file.replication -> 3
15 Feb 2017 15:36:22,819  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.yarn.scheduler.heartbeat.interval-ms -> 5000
15 Feb 2017 15:36:22,819  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.yarn.driver.memoryOverhead -> 384
15 Feb 2017 15:36:22,819  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.yarn.containerLauncherMaxThreads -> 25
15 Feb 2017 15:36:22,820  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.history.kerberos.keytab -> none
15 Feb 2017 15:36:22,820  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.yarn.preserve.staging.files -> false
15 Feb 2017 15:36:22,821  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: spark.master -> yarn-cluster
15 Feb 2017 15:36:22,821  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: Classpath elements:
15 Feb 2017 15:36:22,821  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:
15 Feb 2017 15:36:22,821  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:
15 Feb 2017 15:36:22,821  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:
15 Feb 2017 15:36:23,275  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: 17/02/15 15:36:23 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15 Feb 2017 15:36:23,796  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: 17/02/15 15:36:23 INFO RMProxy: Connecting to ResourceManager at *********:*******
15 Feb 2017 15:36:24,030  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: 17/02/15 15:36:24 INFO Client: Requesting a new application from cluster with 1 NodeManagers
15 Feb 2017 15:36:24,043  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: 17/02/15 15:36:24 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (5120 MB per container)
15 Feb 2017 15:36:24,044  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: 17/02/15 15:36:24 INFO Client: Will allocate AM container, with 2432 MB memory including 384 MB overhead
15 Feb 2017 15:36:24,045  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: 17/02/15 15:36:24 INFO Client: Setting up container launch context for our AM
15 Feb 2017 15:36:24,046  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: 17/02/15 15:36:24 INFO Client: Preparing resources for our AM container
15 Feb 2017 15:36:24,364  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: 17/02/15 15:36:24 WARN DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
15 Feb 2017 15:36:24,402  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: Error: application failed with exception
15 Feb 2017 15:36:24,402  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: org.apache.hadoop.security.AccessControlException: Permission denied: user=www, access=WRITE, inode="/user/www/.sparkStaging/application_1460635834146_0012":hdfs:hdfs:drwxr-xr-x
15 Feb 2017 15:36:24,402  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:319)
15 Feb 2017 15:36:24,402  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:292)
15 Feb 2017 15:36:24,403  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:213)
15 Feb 2017 15:36:24,403  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190)
15 Feb 2017 15:36:24,403  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1698)
15 Feb 2017 15:36:24,403  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1682)
15 Feb 2017 15:36:24,403  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:1665)
15 Feb 2017 15:36:24,403  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.mkdirs(FSDirMkdirOp.java:71)
15 Feb 2017 15:36:24,404  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3895)
15 Feb 2017 15:36:24,404  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:983)
15 Feb 2017 15:36:24,404  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:622)
15 Feb 2017 15:36:24,404  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
15 Feb 2017 15:36:24,404  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
15 Feb 2017 15:36:24,404  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
15 Feb 2017 15:36:24,404  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2081)
15 Feb 2017 15:36:24,405  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2077)
15 Feb 2017 15:36:24,410  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at java.security.AccessController.doPrivileged(Native Method)
15 Feb 2017 15:36:24,410  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at javax.security.auth.Subject.doAs(Subject.java:422)
15 Feb 2017 15:36:24,410  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
15 Feb 2017 15:36:24,411  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2075)
15 Feb 2017 15:36:24,411  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:
15 Feb 2017 15:36:24,414  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
15 Feb 2017 15:36:24,414  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
15 Feb 2017 15:36:24,414  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
15 Feb 2017 15:36:24,414  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
15 Feb 2017 15:36:24,414  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
15 Feb 2017 15:36:24,415  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
15 Feb 2017 15:36:24,415  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:3010)
15 Feb 2017 15:36:24,415  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:2978)
15 Feb 2017 15:36:24,415  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at org.apache.hadoop.hdfs.DistributedFileSystem$21.doCall(DistributedFileSystem.java:1047)
15 Feb 2017 15:36:24,415  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at org.apache.hadoop.hdfs.DistributedFileSystem$21.doCall(DistributedFileSystem.java:1043)
15 Feb 2017 15:36:24,415  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
15 Feb 2017 15:36:24,416  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirsInternal(DistributedFileSystem.java:1043)
15 Feb 2017 15:36:24,416  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:1036)
15 Feb 2017 15:36:24,416  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1877)
15 Feb 2017 15:36:24,416  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:598)
15 Feb 2017 15:36:24,416  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:224)
15 Feb 2017 15:36:24,416  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:384)
15 Feb 2017 15:36:24,416  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:102)
15 Feb 2017 15:36:24,417  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at org.apache.spark.deploy.yarn.Client.run(Client.scala:619)
15 Feb 2017 15:36:24,417  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at org.apache.spark.deploy.yarn.Client$.main(Client.scala:647)
15 Feb 2017 15:36:24,417  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at org.apache.spark.deploy.yarn.Client.main(Client.scala)
15 Feb 2017 15:36:24,417  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
15 Feb 2017 15:36:24,417  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
15 Feb 2017 15:36:24,417  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
15 Feb 2017 15:36:24,417  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at java.lang.reflect.Method.invoke(Method.java:497)
15 Feb 2017 15:36:24,421  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:577)
15 Feb 2017 15:36:24,421  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:174)
15 Feb 2017 15:36:24,421  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:197)
15 Feb 2017 15:36:24,422  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:112)
15 Feb 2017 15:36:24,422  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
15 Feb 2017 15:36:24,422  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark: Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=www, access=WRITE, inode="/user/www/.sparkStaging/application_1460635834146_0012":hdfs:hdfs:drwxr-xr-x
15 Feb 2017 15:36:24,422  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:319)
15 Feb 2017 15:36:24,422  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:292)
15 Feb 2017 15:36:24,422  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:213)
15 Feb 2017 15:36:24,422  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190)
15 Feb 2017 15:36:24,422  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1698)
15 Feb 2017 15:36:24,423  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1682)
15 Feb 2017 15:36:24,432  [DEBUG] (InputStreamReaderRunnable.java:run:32): Stream spark:       ... 33 more

If someone has an idea :) Thanks!

Asked Feb 15 '17 by hartar


1 Answer

You can set the following environment variable which will be used automatically:

export HADOOP_USER_NAME=<your hdfs user>

Also mentioned here:

HADOOP_USER_NAME
This is the Hadoop environment variable which propagates the identity of a user in an insecure cluster
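
In the SparkLauncher case from the question, the variable has to be visible to the spark-submit child process, not just to your own JVM (which is why System.setProperty("user.name", ...) had no effect). Below is a minimal, hedged sketch of one way to do that with Spark's launcher API, using the SparkLauncher(Map<String,String> env) constructor that passes environment variables to the child process; the jar path and main class here are placeholders, not the actual values from the question:

    import java.io.IOException;
    import java.util.HashMap;
    import java.util.Map;

    import org.apache.spark.launcher.SparkLauncher;

    public class SubmitAsJoe {
        public static void main(String[] args) throws IOException {
            // Environment passed to the spark-submit child process.
            // On an insecure (non-Kerberos) cluster, HADOOP_USER_NAME makes the
            // YARN client act as "joe", so the staging directory is created
            // under /user/joe instead of /user/www.
            Map<String, String> env = new HashMap<>();
            env.put("HADOOP_USER_NAME", "joe");

            Process spark = new SparkLauncher(env)
                    .setAppName("mySparkApp")
                    .setAppResource("/usr/myapp/MyJar.jar")       // placeholder path
                    .setMainClass("com.example.ExtractLauncher")  // placeholder class
                    .setMaster("yarn")
                    .setDeployMode("cluster")
                    .setConf(SparkLauncher.DRIVER_MEMORY, "2g")
                    .setVerbose(true)
                    .launch();

            // Wait for spark-submit to finish and report its exit code.
            try {
                int exit = spark.waitFor();
                System.out.println("spark-submit exited with code " + exit);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

Alternatively, if you cannot change the launcher code, exporting HADOOP_USER_NAME in the environment of whatever starts this JVM (as in the export line above) has the same effect, since the spawned spark-submit process inherits it.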

Answered by Yaron