I am trying to traverse a Dataset to do some string similarity calculations such as Jaro-Winkler or cosine similarity. At the moment I convert the Dataset to a list of rows and traverse it with a for statement, which is not an efficient way to do this in Spark. I am looking for a better approach.
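import java.util.Arrays;
import java.util.List;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.Metadata;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;

public class sample {
    public static void main(String[] args) {
        // A single SparkSession replaces the separate JavaSparkContext and SQLContext
        SparkSession spark = SparkSession.builder().appName("Example").master("local[*]").getOrCreate();
        List<Row> data = Arrays.asList(
                RowFactory.create("Mysore", "Mysuru"),
                RowFactory.create("Name", "FirstName"));
        StructType schema = new StructType(new StructField[] {
                new StructField("Word1", DataTypes.StringType, true, Metadata.empty()),
                new StructField("Word2", DataTypes.StringType, true, Metadata.empty()) });
        Dataset<Row> oldDF = spark.createDataFrame(data, schema);
        oldDF.show();
        // Collecting pulls every row to the driver; this is the step I want to avoid
        List<Row> rowslist = oldDF.collectAsList();
        spark.stop();
    }
}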
I have found many JavaRDD examples, but they are not clear to me. An example for Dataset would help me a lot.
In Spark, foreach() is an action available on RDD, DataFrame, and Dataset that applies a function to each element. It returns no value, so it is only useful for side effects such as printing or writing to an external system.
Unlike a plain for loop over a collected list, the function runs on the executors rather than on the driver, so the data never has to be gathered in one place.
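For a Dataset<Row> you can pass an org.apache.spark.api.java.function.ForeachFunction, as below. The explicit cast tells the Java compiler which overload of foreach the lambda is meant for.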
oldDF.foreach((ForeachFunction<Row>) row -> System.out.println(row));
For older JDKs that do not support lambda expressions (before Java 8), convert the Dataset to a JavaRDD and pass an anonymous VoidFunction after importing it:
import org.apache.spark.api.java.function.VoidFunction;
yourDataSet.toJavaRDD().foreach(new VoidFunction<Row>() {
    @Override
    public void call(Row r) throws Exception {
        // Replace the placeholder with the name of the column to read
        System.out.println(r.getAs("your column name here"));
    }
});
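Since foreach() only produces side effects, a similarity score needs a function that returns a value for each pair of strings. The following is a minimal sketch of such a function in plain Java, with no Spark dependency, using the textbook Jaro-Winkler definition (prefix scale 0.1, prefix capped at 4 characters). The class and method names are hypothetical and not part of any library, and null inputs are not handled even though the schema in the question allows them.

```java
public class JaroWinklerSketch {

    // Jaro similarity in [0, 1]; hypothetical helper, not a Spark API.
    public static double jaro(String s1, String s2) {
        if (s1.equals(s2)) return 1.0;
        int len1 = s1.length(), len2 = s2.length();
        if (len1 == 0 || len2 == 0) return 0.0;

        // Characters match only if they are equal and at most `window` positions apart.
        int window = Math.max(0, Math.max(len1, len2) / 2 - 1);
        boolean[] matched1 = new boolean[len1];
        boolean[] matched2 = new boolean[len2];

        int matches = 0;
        for (int i = 0; i < len1; i++) {
            int lo = Math.max(0, i - window);
            int hi = Math.min(len2 - 1, i + window);
            for (int j = lo; j <= hi; j++) {
                if (!matched2[j] && s1.charAt(i) == s2.charAt(j)) {
                    matched1[i] = true;
                    matched2[j] = true;
                    matches++;
                    break;
                }
            }
        }
        if (matches == 0) return 0.0;

        // Walk both strings' matched characters in order and count those that differ.
        int k = 0, outOfOrder = 0;
        for (int i = 0; i < len1; i++) {
            if (!matched1[i]) continue;
            while (!matched2[k]) k++;
            if (s1.charAt(i) != s2.charAt(k)) outOfOrder++;
            k++;
        }
        double m = matches;
        double t = outOfOrder / 2.0; // each transposition is counted twice above
        return (m / len1 + m / len2 + (m - t) / m) / 3.0;
    }

    // Jaro-Winkler: boosts the Jaro score for a shared prefix of up to 4 characters.
    public static double jaroWinkler(String s1, String s2) {
        double j = jaro(s1, s2);
        int limit = Math.min(4, Math.min(s1.length(), s2.length()));
        int prefix = 0;
        while (prefix < limit && s1.charAt(prefix) == s2.charAt(prefix)) prefix++;
        return j + prefix * 0.1 * (1.0 - j);
    }

    public static void main(String[] args) {
        // The two sample rows from the question
        System.out.println(jaroWinkler("Mysore", "Mysuru"));
        System.out.println(jaroWinkler("Name", "FirstName"));
    }
}
```

A function like this could then be applied per row inside Spark, for example through Dataset.map or a registered UDF, so the scores are computed on the executors instead of in a driver-side loop. That wiring is not shown here.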