
Spark job fails on Java 9 with NumberFormatException for input string "ea"

I have a sample Spark job that runs successfully on Java 8, but when I run the same program on Java 9 it fails with a NumberFormatException:

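import org.apache.spark.SparkConf;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

SparkConf conf = new SparkConf();
conf.setMaster("local[*]").setAppName("java 9 example");
SparkSession session = SparkSession.builder().config(conf).getOrCreate();
Dataset<Row> ds = session.read().text("<xyz path>");
System.out.println(ds.count());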

Exception Details:

Exception in thread "main" java.lang.NumberFormatException: For input string: "ea"
    at java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.base/java.lang.Integer.parseInt(Integer.java:695)
    at java.base/java.lang.Integer.parseInt(Integer.java:813)
    at scala.collection.immutable.StringLike$class.toInt(StringLike.scala:229)
    at scala.collection.immutable.StringOps.toInt(StringOps.scala:31)
    at org.apache.spark.SparkContext.warnDeprecatedVersions(SparkContext.scala:353)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:186)
    at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2313)
    at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:868)
    at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:860)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:860)
    at com.ts.spark.session.TestApp.main(TestApp.java:18)

Maven Spark dependencies:

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.10</artifactId>
    <version>2.1.0</version>
</dependency>

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.10</artifactId>
    <version>2.1.0</version>
</dependency>

Java Details:

java version "9-ea"
Java(TM) SE Runtime Environment (build 9-ea+156)
Java HotSpot(TM) 64-Bit Server VM (build 9-ea+156, mixed mode)

Are there any additional steps that I need to follow to set up Spark on Java 9? Thanks!

Rahul Sharma asked Dec 23 '22 14:12


2 Answers

The stack trace shows Scala's StringLike.toInt being called from SparkContext.warnDeprecatedVersions to parse "ea" (a portion of the "9-ea" version string) as an integer. The JDK 9 build you are using (9-ea+156) is old; newer builds dropped the "-ea" suffix as the JDK 9 release candidate approached. Get the latest JDK 9 download (jdk-9+181), and also submit a bug to Spark so that the code that parses the version string can be examined. A good reference for the version-string scheme is JEP 223 (http://openjdk.java.net/jeps/223).
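To illustrate the failure mode, here is a minimal sketch in plain Java. It is not Spark's actual code: naiveSecondComponent is a hypothetical reconstruction of the kind of parsing the stack trace suggests (split the version string on separators, then convert the second token to an int), and featureVersion is one possible tolerant alternative that reads only the leading numeric components. The class and method names are made up for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class JavaVersionParse {
    // Matches only the leading numeric components, e.g. "1.8" in "1.8.0_121" or "9" in "9-ea".
    private static final Pattern LEADING_NUMBERS = Pattern.compile("^(\\d+)(?:\\.(\\d+))?");

    // Hypothetical reconstruction of the failing approach (an assumption, not Spark's source):
    // split on '+', '.' and '-' and parse the second token as an int.
    static int naiveSecondComponent(String version) {
        String[] parts = version.split("[+.\\-]+", 3);
        return Integer.parseInt(parts[1]); // for "9-ea", parts[1] is "ea"
    }

    // Tolerant alternative: ignore any pre-release or build suffix.
    // Old-style strings ("1.8.0_121") report the second number, new-style ("9-ea", "9.0.1") the first.
    static int featureVersion(String version) {
        Matcher m = LEADING_NUMBERS.matcher(version);
        if (!m.find()) {
            throw new IllegalArgumentException("Unrecognized version: " + version);
        }
        int first = Integer.parseInt(m.group(1));
        if (first == 1 && m.group(2) != null) {
            return Integer.parseInt(m.group(2));
        }
        return first;
    }

    public static void main(String[] args) {
        System.out.println(naiveSecondComponent("1.8.0_121")); // prints 8
        try {
            naiveSecondComponent("9-ea");
        } catch (NumberFormatException e) {
            System.out.println(e.getMessage()); // prints For input string: "ea"
        }
        System.out.println(featureVersion("9-ea"));      // prints 9
        System.out.println(featureVersion("1.8.0_121")); // prints 8
        System.out.println(featureVersion("9"));         // prints 9
    }
}
```

The naive variant happens to work for "1.8.0_121" because the second token is numeric, which would explain why the same job runs on Java 8.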

Alan Bateman answered Dec 28 '22 03:12


Spark does not seem to handle non-integer components in the Java version string (at least in the code that checks for supported JVMs). Release builds do not have such components; the "ea" string comes from the early-access build. Once Java 9 is properly released, the job will probably run correctly.

Gábor Bakos answered Dec 28 '22 02:12