I am new to Java 8 and Spark, and I am trying to run a simple flatMap transformation program in Java. I am getting an error on the flatMap call in the second-to-last line, Arrays.asList(e.split(" "))), and the error is
Type mismatch: cannot convert from
List<String> to Iterator<String>
What is the appropriate solution for this problem? Thanks in advance.
import java.util.Arrays;
import java.util.List;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.commons.lang.StringUtils;
public class FlatMapExample {
    public static void main(String[] args) throws Exception {
        SparkConf sparkConf = new SparkConf()
                .setMaster("local")
                .setAppName("filter transformation");
        JavaSparkContext sc = new JavaSparkContext(sparkConf);

        // Parallelized with 2 partitions
        JavaRDD<String> rddX = sc.parallelize(
                Arrays.asList("spark rdd example", "sample example"),
                2);

        // map operation will return a List of arrays in the following case
        JavaRDD<String[]> rddY = rddX.map(e -> e.split(" "));
        List<String[]> listUsingMap = rddY.collect();
        for (int i = 0; i < listUsingMap.size(); i++) {
            System.out.println("list.." + StringUtils.join(listUsingMap.get(i)));
        }
        //System.out.println("listUsingMap..."+listUsingMap.collect());

        // flatMap operation will return a list of Strings in the following case
        JavaRDD<String> rddY2 = rddX.flatMap(e -> Arrays.asList(e.split(" "))); // <-- error here
        List<String> listUsingFlatMap = rddY2.collect();
    }
}
You should have specified that you are using at least version 2.0, where FlatMapFunction::call actually returns an Iterator and not an Iterable (as it still did in 1.6, for example). Thus, the lambda you pass to rddX.flatMap is supposed to return an Iterator<String>, while Arrays.asList(e.split(" ")) returns a List<String>.
But there is List::iterator that you can use, as:
rddX.flatMap(e -> Arrays.asList(e.split(" ")).iterator())
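To see why .iterator() is all that's needed, here is a minimal plain-Java sketch (no Spark on the classpath) of the same word-splitting step; the class and method names are made up for illustration. The splitWords method has the same shape as a Spark 2.x FlatMapFunction<String, String>::call: it must hand back an Iterator<String>, and List::iterator performs exactly that conversion.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class FlatMapIteratorDemo {
    // Same signature shape as Spark 2.x's FlatMapFunction<String, String>::call:
    // it must return an Iterator<String>, not a List<String>.
    static Iterator<String> splitWords(String line) {
        // List::iterator converts the List<String> from Arrays.asList
        // into the Iterator<String> that flatMap expects.
        return Arrays.asList(line.split(" ")).iterator();
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("spark rdd example", "sample example");
        // flatMap conceptually drains one Iterator per input element
        // and flattens the results into a single collection.
        List<String> words = new ArrayList<>();
        for (String line : lines) {
            Iterator<String> it = splitWords(line);
            while (it.hasNext()) {
                words.add(it.next());
            }
        }
        System.out.println(words); // [spark, rdd, example, sample, example]
    }
}
```

With the same change applied in your program, rddY2.collect() returns the flattened list of words.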