Convert Scala expression to Java 1.8

I'm trying to convert this Scala expression to Java:

val corpus: RDD[String] = sc.wholeTextFiles("docs/*.md").map(_._2)

This is what I have in Java:

RDD<String> corpus = sc.wholeTextFiles("docs/*.md").map(a -> a._2);

But I get an error on a._2:

Bad return type in lambda expression: String cannot be converted to R

If I go to the "super" method, this is what I see:

package org.apache.spark.api.java.function;

import java.io.Serializable;

public interface Function<T1, R> extends Serializable {
        R call(T1 var1) throws Exception;
}
neuromouse asked Mar 05 '16 09:03


2 Answers

wholeTextFiles returns a pair RDD whose elements are tuples. In Scala you can access a tuple's members as _1 and _2. Java has no built-in tuples, so Spark's Java API uses scala.Tuple2, where these members are methods, and Java always requires parentheses on a method call. It should look like this:

JavaRDD<String> corpus = sc.wholeTextFiles("docs/*.md").map(a -> a._2());
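
To see the accessor difference in isolation, without a Spark dependency, here is a minimal sketch. Pair is a hypothetical stand-in for scala.Tuple2 (it is not Spark's or Scala's class), and a plain java.util.stream pipeline plays the role of the RDD's map. The names TupleAccessDemo and secondElements are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class TupleAccessDemo {

    // Hypothetical stand-in for scala.Tuple2: the elements are exposed as
    // methods named _1() and _2(), so Java callers must write the parentheses.
    static final class Pair<A, B> {
        private final A first;
        private final B second;

        Pair(A first, B second) {
            this.first = first;
            this.second = second;
        }

        A _1() { return first; }
        B _2() { return second; }
    }

    // Keeps only the second element of each pair, analogous to map(a -> a._2()).
    static <A, B> List<B> secondElements(List<Pair<A, B>> pairs) {
        return pairs.stream()
                .map(p -> p._2())   // p._2 without parentheses does not compile
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Pair<String, String>> files = Arrays.asList(
                new Pair<String, String>("docs/a.md", "# A"),
                new Pair<String, String>("docs/b.md", "# B"));
        System.out.println(secondElements(files)); // prints [# A, # B]
    }
}
```

The lambda has the same shape as the one passed to map above: the only change from the Scala version is calling _2() as a method.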

Edit: It seems that in Scala, map takes an implicit parameter (a ClassTag), which means that if you call the Scala RDD's map from Java you have to pass it explicitly. See the Java and Scala API documentation for map.

Edit 2: After a few hours of fumbling, the answer was found: the result type has to be JavaRDD, not RDD.

Luka Jacobowitz answered Nov 15 '22 05:11


You should be able to use values() to get the result you want in Java here:

JavaRDD<String> corpus = sc.wholeTextFiles("docs/*.md").values();

Note that the type here is JavaRDD, not RDD.

Steve Willcock answered Nov 15 '22 05:11