 

basedir must be absolute: ?/.ivy2/local

I'm writing here in a state of complete desperation...

I have 2 users:

  • 1 local user, created in Linux: works 100% fine, the word count runs perfectly. Kerberized cluster, valid ticket.
  • 1 Active Directory user: can log in, but the same pyspark word count fails. Same KDC ticket as the user above.

Exception in thread "main" java.lang.IllegalArgumentException: basedir must be absolute: ?/.ivy2/local
  at org.apache.ivy.util.Checks.checkAbsolute(Checks.java:48)
  at org.apache.ivy.plugins.repository.file.FileRepository.setBaseDir(FileRepository.java:135)
  at org.apache.ivy.plugins.repository.file.FileRepository.<init>(FileRepository.java:44)
  at org.apache.spark.deploy.SparkSubmitUtils$.createRepoResolvers(SparkSubmit.scala:943)
  at org.apache.spark.deploy.SparkSubmitUtils$.buildIvySettings(SparkSubmit.scala:1035)
  at org.apache.spark.deploy.SparkSubmit$$anonfun$2.apply(SparkSubmit.scala:295)
  at org.apache.spark.deploy.SparkSubmit$$anonfun$2.apply(SparkSubmit.scala:295)
  at scala.Option.getOrElse(Option.scala:121)
  at org.apache.spark.deploy.SparkSubmit$.prepareSubmitEnvironment(SparkSubmit.scala:294)
  at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:153)
  at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
  at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

The code I'm running is super simple:

import findspark
findspark.init()  # make the local Spark installation importable

from pyspark import SparkConf, SparkContext

conf = SparkConf().setMaster("yarn")
sc = SparkContext(conf=conf)  # fails here with the exception above

The last instruction fails with the exception shown above.

?/.ivy2/local -> this is the problem, but I have no idea what's going on :(

With the Linux user it works perfectly... but with the AD user, which doesn't exist in the local system but does have /home/userFolder ... I have this problem :(

Please help... I've reached the point of insanity... I've googled every corner of the internet but haven't found any solution to this problem/mistake :( Stack Overflow is my last resort, heeeelp

asked Jun 14 '18 by Joao Barreto


1 Answer

Context

Ivy needs a directory called .ivy2, by default located in the user's home directory. You can also configure where .ivy2 should live by passing a configuration property when Spark starts or when you run spark-submit.
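For example, since the question uses PySpark, the property can also be set on the SparkConf before the context is created. This is a minimal sketch adapting the question's code; /tmp/.ivy is just an example of an absolute, writable path, and whether the property is picked up this early can depend on the Spark version, so the spark-submit flag shown in Solution 1 below is the most direct route:

import findspark
findspark.init()

from pyspark import SparkConf, SparkContext

# Point Ivy at an absolute, writable directory instead of <user.home>/.ivy2
conf = (
    SparkConf()
    .setMaster("yarn")
    .set("spark.jars.ivy", "/tmp/.ivy")
)
sc = SparkContext(conf=conf)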

Where the problem comes from

In IvySettings.java (line 796 in version 2.2.0 of ant-ivy) there is this block:

if (getVariable("ivy.home") != null) {
   setDefaultIvyUserDir(Checks.checkAbsolute(getVariable("ivy.home"), "ivy.home"));
   Message.verbose("using ivy.default.ivy.user.dir variable for default ivy user dir: " + defaultUserDir);
} else {
   setDefaultIvyUserDir(new File(System.getProperty("user.home"), ".ivy2"));
   Message.verbose("no default ivy user dir defined: set to " + defaultUserDir);
}

As you can see, if ivy.home is not set, Ivy falls back to <user.home>/.ivy2. When the JVM cannot resolve the user's home directory (for example because the account has no entry in the local user database), it typically sets user.home to ?, so the resulting path is not absolute and you get the error:

Exception in thread "main" java.lang.IllegalArgumentException: basedir must be absolute: ?/.ivy2/local
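Since the error appears only for the Active Directory user, it is worth checking how that account resolves on the node. A minimal diagnostic sketch in Python, run as the affected user:

import os, pwd

# If the running user has no resolvable entry in the local user database,
# the JVM typically falls back to user.home = "?", which is where the
# non-absolute path "?/.ivy2" comes from.
try:
    entry = pwd.getpwuid(os.getuid())
    print("passwd entry:", entry.pw_name, "-> home:", entry.pw_dir)
except KeyError:
    print("no passwd entry found for uid", os.getuid())

print("HOME environment variable:", os.environ.get("HOME"))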

Solution 1 (spark-shell or spark-submit)

As Rocke Yang has mentioned, you can start spark-shell or spark-submit with the configuration property spark.jars.ivy set to an absolute path. Example:

spark-shell --conf spark.jars.ivy=/tmp/.ivy
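Since the question uses PySpark, the same flag can be passed to pyspark or spark-submit as well (your_script.py is only a placeholder name):

spark-submit --conf spark.jars.ivy=/tmp/.ivy your_script.py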

Solution 2 (spark-launcher or yarn-client)

A second solution is to set the configuration property when launching the application programmatically with SparkLauncher:

SparkLauncher sparkLauncher = new SparkLauncher()
  .setSparkHome("/path/to/SPARK_HOME")
  .setAppResource("/path/to/jar/to/be/executed")
  .setMainClass("MainClassName")
  .setMaster("yarn")                        // or "local", etc.
  .setDeployMode("cluster")                 // or "client"
  .setConf("spark.executor.cores", "2")
  .setConf("spark.jars.ivy", "/tmp/.ivy");  // absolute path for the Ivy directory

Ticket opened

The Spark community has an open ticket for this issue.

answered Sep 20 '22 by KeyMaker00