
How to remove CompactBuffer from Spark output

Below is the program I ran in the Spark shell. When I save the output to HDFS, the grouped values are wrapped in CompactBuffer(...). How can I remove CompactBuffer from the saved output?

Program:

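// Build (key, value) pairs: column 1 is the key, column 2 the value.
val a = sc.textFile("/datagen_10.txt")
val b = a.map(p => (p.split(",")(1), p.split(",")(2)))

// Same key (column 1); the value here is column 0.
val c = sc.textFile("/drug.txt")
val d = c.map(p => (p.split(",")(1), p.split(",")(0)))

// Group both datasets by key and write the result as text.
val e = b.cogroup(d)
e.saveAsTextFile("/cogroup")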

Output:

(avil,(CompactBuffer(Brandon Buckner, Veda Hopkins, Mara Higgins, Sybill Crosby, Ivan Hale),CompactBuffer(1)))
(metacin,(CompactBuffer(Len Burgess),CompactBuffer(2)))
(paracetamol,(CompactBuffer(Zia Underwood, Austin Mayer, Tyler Rosales, Alika Gilmore),CompactBuffer(3)))
vivman asked Mar 13 '26 07:03



1 Answer

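You'll have to create the output strings manually. saveAsTextFile writes each record's toString, and the grouped values are CompactBuffer instances, so build the line yourself with mkString, for example: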

e.map { case (k, (xs, ys)) =>
  // mkString joins the grouped values without the CompactBuffer(...) wrapper
  s"""($k, ((${xs.mkString(",")}), (${ys.mkString(",")})))"""
}
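The same idea can be tried without a cluster. The following is a minimal sketch in plain Scala, assuming the grouped values can be treated as ordinary Iterable collections (which is what cogroup hands back). The object CogroupFormat and the helper formatRecord are hypothetical names introduced here for illustration, not part of Spark, and the sample values are copied from the output in the question.

```scala
object CogroupFormat {
  // Hypothetical helper (not a Spark API): renders one cogroup record as
  // "(key,((a, b),(c)))" by joining the values with mkString instead of
  // relying on the collections' own toString.
  def formatRecord(key: String, left: Iterable[Any], right: Iterable[Any]): String =
    s"($key,((${left.mkString(", ")}),(${right.mkString(", ")})))"

  def main(args: Array[String]): Unit = {
    // Sample record taken from the question's output.
    val line = formatRecord("metacin", Seq("Len Burgess"), Seq("2"))
    println(line) // prints (metacin,((Len Burgess),(2)))
  }
}
```

In the Spark shell, and assuming e is the cogrouped RDD from the question, this would presumably be applied as e.map { case (k, (xs, ys)) => CogroupFormat.formatRecord(k, xs, ys) } followed by saveAsTextFile on the result. The separator and bracket layout are a matter of taste; adjust the format string to whatever the downstream consumer expects.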
zero323 answered Mar 15 '26 23:03
