Below is the program I ran in the Spark shell. When I save the output to HDFS, it contains CompactBuffer. How can I remove CompactBuffer from the Spark output?
Program:
val a=sc.textFile("/datagen_10.txt")
val b=a.map(p=>(p.split(",")(1),p.split(",")(2)))
val c=sc.textFile("/drug.txt")
val d =c.map(p=>(p.split(",")(1),p.split(",")(0)))
val e=b.cogroup(d)
e.saveAsTextFile("/cogroup")
Output:
(avil,(CompactBuffer(Brandon Buckner, Veda Hopkins, Mara Higgins, Sybill Crosby, Ivan Hale),CompactBuffer(1)))
(metacin,(CompactBuffer(Len Burgess),CompactBuffer(2)))
(paracetamol,(CompactBuffer(Zia Underwood, Austin Mayer, Tyler Rosales, Alika Gilmore),CompactBuffer(3)))
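saveAsTextFile writes each element's toString, and the values produced by cogroup are Iterables backed by CompactBuffer, which is why that name shows up. You'll have to create the output strings manually, for example: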
e.map { case (k, (xs, ys)) =>
  s"""($k, ((${xs.mkString(",")}), (${ys.mkString(",")})))""" }
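
As a rough sketch of the same idea, the formatting can be pulled into a small function and tried on plain Scala collections before applying it to the RDD. The helper name formatRecord, the tab-separated layout, and the sample values are assumptions made for illustration; they are not part of the original answer. It can be pasted into the Spark shell or a plain Scala REPL.

```scala
// Hypothetical helper (illustration only): turns one cogroup record into a
// tab-separated line, joining each group of values with commas.
def formatRecord(key: String, names: Iterable[String], ids: Iterable[String]): String =
  s"$key\t${names.mkString(",")}\t${ids.mkString(",")}"

// Plain Scala collections standing in for the cogrouped values.
val sample = Seq(
  ("metacin", (Seq("Len Burgess"), Seq("2"))),
  ("avil", (Seq("Brandon Buckner", "Veda Hopkins"), Seq("1")))
)

val lines = sample.map { case (k, (xs, ys)) => formatRecord(k, xs, ys) }
lines.foreach(println)

// In the Spark shell, the same function could be applied before saving:
// e.map { case (k, (xs, ys)) => formatRecord(k, xs, ys) }.saveAsTextFile("/cogroup")
```

Because mkString works on any Iterable, the function does not depend on CompactBuffer itself, and the separators can be changed to whatever the downstream consumer expects.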