
How to solve this error "Type mismatch: cannot convert from List<String> to Iterator<String>"

I am new to Java 8 and Spark, and I am trying to run a simple flatMap transformation program in Java. However, I get an error on the flatMap transformation in the second-to-last line, at Arrays.asList(e.split(" ")). The error is:

Type mismatch: cannot convert from List<String> to Iterator<String>

What is the appropriate solution for this problem? Thanks in advance.

 import java.util.Arrays;
 import java.util.List;

 import org.apache.spark.SparkConf;
 import org.apache.spark.api.java.JavaRDD;
 import org.apache.spark.api.java.JavaSparkContext;
 import org.apache.commons.lang.StringUtils;

 public class FlatMapExample {
    public static void main(String[] args) throws Exception {
    SparkConf sparkConf = new SparkConf().setMaster("local").setAppName("filter transformation");
    JavaSparkContext sc = new JavaSparkContext(sparkConf);



    // Parallelized with 2 partitions
    JavaRDD<String> rddX = sc.parallelize(
            Arrays.asList("spark rdd example", "sample example"),
            2);

    // map operation will return List of Array in following case
    JavaRDD<String[]> rddY = rddX.map(e -> e.split(" "));

    List<String[]> listUsingMap = rddY.collect();
    for(int i = 0; i < listUsingMap.size(); i++)
    {
        System.out.println("list.."+StringUtils.join(listUsingMap.get(i)));
    }
    //System.out.println("listUsingMap..."+listUsingMap.collect());

    // flatMap operation will return list of String in following case
    JavaRDD<String> rddY2 = rddX.flatMap(e -> Arrays.asList(e.split(" ")));
    List<String> listUsingFlatMap = rddY2.collect();
}

}

Bhagesh Arora asked Mar 13 '26 23:03


1 Answer

You should have specified that you are using at least version 2.0, where FlatMapFunction::call returns an Iterator rather than an Iterable (as it did in 1.6, for example). Thus, the function you pass to rddX.flatMap is supposed to return an Iterator<String>, while Arrays.asList(e.split(" ")) returns a List<String>.

But you can use List::iterator to get one, like this:

 rddX.flatMap(e -> Arrays.asList(e.split(" ")).iterator())
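
To see the type rule in isolation, here is a minimal sketch that runs without Spark. The IteratorFunction interface and the flatMap helper are hypothetical stand-ins written for this illustration, not Spark classes. They only assume what is stated above: that the function passed to flatMap must return an Iterator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class FlatMapSketch {

    // Hypothetical stand-in for Spark 2.x's FlatMapFunction: call() must return an Iterator.
    @FunctionalInterface
    interface IteratorFunction<T, R> {
        Iterator<R> call(T input) throws Exception;
    }

    // Hypothetical helper that imitates flatMap on a plain List (no Spark involved).
    static <T, R> List<R> flatMap(List<T> input, IteratorFunction<T, R> f) {
        List<R> out = new ArrayList<>();
        for (T item : input) {
            try {
                // Drain the iterator returned for this element into the flat result.
                Iterator<R> it = f.call(item);
                while (it.hasNext()) {
                    out.add(it.next());
                }
            } catch (Exception ex) {
                throw new RuntimeException(ex);
            }
        }
        return out;
    }

    static List<String> splitWords(List<String> lines) {
        // Without .iterator() this lambda does not compile:
        // a List<String> cannot be converted to an Iterator<String>.
        return flatMap(lines, e -> Arrays.asList(e.split(" ")).iterator());
    }

    public static void main(String[] args) {
        List<String> words = splitWords(Arrays.asList("spark rdd example", "sample example"));
        System.out.println(words); // prints [spark, rdd, example, sample, example]
    }
}
```

Removing the .iterator() call in splitWords should reproduce the same kind of type mismatch error as in the question, since the lambda would then return a List<String> where an Iterator<String> is expected.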
Eugene answered Mar 15 '26 12:03


