I have an RDD of the form
org.apache.spark.rdd.RDD[(String, Array[String])]
I want to write this to a CSV file. Please suggest how this can be done.
Calling myrdd.saveAsTextFile directly gives output like the below, because Array[String] has no useful toString:
(875,[Ljava.lang.String;@53620618)
(875,[Ljava.lang.String;@487e3c6c)
You can try:
myrdd.map(a => a._1 + "," + a._2.mkString(",")).saveAsTextFile(outputPath) // outputPath: your destination directory
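For instance, with a hypothetical sample pair matching the shape in the question, the mapped expression produces one CSV line per record:

```scala
val pair = ("875", Array("foo", "bar"))          // hypothetical sample record
val line = pair._1 + "," + pair._2.mkString(",") // same expression as in the map above
// line == "875,foo,bar"
println(line)
```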
The other answer doesn't handle escaping (values that themselves contain commas or quotes). Perhaps this more general solution:
import au.com.bytecode.opencsv.CSVWriter
import java.io.StringWriter
import scala.collection.JavaConversions._ // implicitly converts the Scala List below to the java.util.List that writeAll expects

val toCsv = (a: Array[String]) => {
  val buf = new StringWriter
  val writer = new CSVWriter(buf)
  writer.writeAll(List(a))
  buf.toString.trim
}

rdd.map(t => Array(t._1) ++ t._2)
  .map(a => toCsv(a))
  .saveAsTextFile(dest)
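On more recent Spark versions, another option is to convert the RDD to a DataFrame and let the built-in CSV data source handle quoting and escaping for you. A minimal sketch, assuming Spark 2.x, the myrdd from the question, and a hypothetical output path:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder.appName("rdd-to-csv").getOrCreate()
import spark.implicits._ // enables .toDF on RDDs of tuples

// Join each value array into a single field; the CSV writer will quote the
// field automatically since it contains the delimiter. If every array has
// the same length, you could instead map to a tuple of fixed columns.
myrdd.map { case (key, values) => (key, values.mkString(",")) }
  .toDF("key", "values")
  .write
  .csv("/path/to/output") // hypothetical path; Spark writes part files into this directory
```

Like saveAsTextFile, df.write.csv produces a directory of part files rather than a single CSV file.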