
How to convert Dataset into JavaPairRDD?

There is a method to convert a Dataset to a JavaRDD:

Dataset<Row> dataFrame;
// toJavaRDD() on a Dataset<Row> yields a JavaRDD<Row>, not a JavaRDD<String>.
JavaRDD<Row> data = dataFrame.toJavaRDD();

Is there a similar way to convert a Dataset into a JavaPairRDD<Long, Vector>?

Manikandan Balasubramanian asked May 02 '17 06:05



1 Answer

You can pass a PairFunction to mapToPair, as shown below. Check the column indexes in your Dataset first: in this sample, index 0 holds the Long value and index 3 holds the Vector.

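// Needs scala.Tuple2, org.apache.spark.api.java.JavaPairRDD and org.apache.spark.api.java.function.PairFunction.
JavaPairRDD<Long, Vector> jpRDD = dataFrame.toJavaRDD().mapToPair(new PairFunction<Row, Long, Vector>() {
    @Override
    public Tuple2<Long, Vector> call(Row row) throws Exception {
        // Column 0 becomes the key and column 3 the value; adjust both indexes to your schema.
        return new Tuple2<Long, Vector>((Long) row.get(0), (Vector) row.get(3));
    }
});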
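One thing to watch with the cast `(Long) row.get(0)`: it only succeeds when the cell really holds a boxed Long. If the key column is an integer column, the cell would be an Integer and the cast would throw a ClassCastException. The following is a minimal plain-Java sketch of a more tolerant conversion. The class name `KeyCast` and the method `toLongKey` are hypothetical helpers made up for illustration, not part of Spark.

```java
// Hypothetical helper (not a Spark API): widens any boxed numeric cell to a long key.
final class KeyCast {
    private KeyCast() {}

    // Accepts Integer, Long, Short, etc.; rejects null and non-numeric values.
    static long toLongKey(Object cell) {
        if (cell instanceof Number) {
            return ((Number) cell).longValue();
        }
        throw new IllegalArgumentException("Key column is not numeric: " + cell);
    }
}
```

Under these assumptions, `KeyCast.toLongKey(row.get(0))` could stand in for `(Long) row.get(0)` inside the `call` method. If the key column is known to be a long column, the plain cast is fine as written.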
abaghel answered Sep 21 '22 12:09
