 

Spark 2.0.0 Arrays.asList not working - incompatible types

The following code works with Spark 1.5.2 but not with Spark 2.0.0. I am using Java 1.8.

import java.util.Arrays;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

final SparkConf sparkConf = new SparkConf();
sparkConf.setMaster("local[4]"); // Four threads
final JavaSparkContext javaSparkContext = new JavaSparkContext(sparkConf);
final JavaRDD<String> javaRDDLines = javaSparkContext.textFile("4300.txt");
final JavaRDD<String> javaRDDWords = javaRDDLines.flatMap(line -> Arrays.asList(line.split(" ")));

I get the following error:

Error:(46, 66) java: incompatible types: no instance(s) of type variable(s) T exist so that java.util.List<T> conforms to java.util.Iterator<U>

I am unable to figure out whether the Spark API has changed or something else is wrong. Please help. Thanks.

asked Aug 10 '16 by Vinay
1 Answer

In Spark 2.0, FlatMapFunction.call() returns an Iterator rather than an Iterable. Try this:

JavaRDD<String> javaRDDWords = javaRDDLines.flatMap(line -> Arrays.asList(line.split(" ")).iterator());
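
For reference, here is a minimal self-contained sketch of the whole pipeline compiled against the Spark 2.x Java API. The class name WordSplit is just a placeholder, and the 4300.txt path is taken from the question; adjust both to your setup.

import java.util.Arrays;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class WordSplit {
    public static void main(String[] args) {
        final SparkConf sparkConf = new SparkConf()
                .setAppName("WordSplit")
                .setMaster("local[4]"); // Four threads
        final JavaSparkContext javaSparkContext = new JavaSparkContext(sparkConf);

        final JavaRDD<String> javaRDDLines = javaSparkContext.textFile("4300.txt");
        // Spark 2.x expects FlatMapFunction.call() to return an Iterator,
        // so convert the List from Arrays.asList() via .iterator().
        final JavaRDD<String> javaRDDWords =
                javaRDDLines.flatMap(line -> Arrays.asList(line.split(" ")).iterator());

        System.out.println("Number of words: " + javaRDDWords.count());
        javaSparkContext.close();
    }
}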
answered Oct 13 '22 by shmosel