 

Append a row to a pair RDD in spark

I have a pair RDD of existing values such as: (1,2) (3,4) (5,6)

I want to append a row (7,8) to the same RDD

How can I append to the same RDD in Spark?

asked Aug 31 '25 by user8452799
1 Answer

You can use the union operation. RDDs are immutable, so you can't append to an existing RDD in place; union returns a new RDD containing the elements of both.

scala> val rdd1 = sc.parallelize(List((1,2), (3,4), (5,6)))
rdd1: org.apache.spark.rdd.RDD[(Int, Int)] = ParallelCollectionRDD[0] at parallelize at <console>:24

scala> val rdd2 = sc.parallelize(List((7, 8)))
rdd2: org.apache.spark.rdd.RDD[(Int, Int)] = ParallelCollectionRDD[1] at parallelize at <console>:24

scala> val unionOfTwo = rdd1.union(rdd2)
unionOfTwo: org.apache.spark.rdd.RDD[(Int, Int)] = UnionRDD[2] at union at <console>:28
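As a quick check (a minimal sketch assuming the same spark-shell session; the exact res numbering will vary), collecting the union shows that the new row now appears alongside the original pairs:

scala> unionOfTwo.collect()
res0: Array[(Int, Int)] = Array((1,2), (3,4), (5,6), (7,8))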
answered Sep 02 '25 by Constantine