If I write
dataFrame.write.format("parquet").mode("append").save("temp.parquet")
then in the temp.parquet folder I get the same number of files as the number of rows.
I don't think I fully understand Parquet yet, but is this behaviour normal?
Spark can append a DataFrame to existing Parquet files using the "append" save mode. If you want to replace the existing data instead, use the "overwrite" save mode.
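For instance, a minimal sketch of the overwrite variant (assuming dataFrame is an existing DataFrame, as in the question):

// Replaces the contents of temp.parquet instead of adding new files to it.
dataFrame.write.format("parquet").mode("overwrite").save("temp.parquet")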
Storing large data files in Parquet format (instead of traditional CSV) and using PyArrow utility methods can give faster processing times, especially in situations where reading the file takes significantly longer than the actual data processing.
An ORC or Parquet file contains data columns, and you can add partition columns to these files at write time. The data files do not store values for partition columns; instead, when writing the files you divide them into groups (partitions) based on column values.
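As a sketch, partitioning at write time looks like this (the country column is an illustrative assumption, not from the question):

// Writes one subdirectory per distinct value, e.g. temp.parquet/country=US/.
// The country values live in the directory names, not in the data files.
dataFrame.write.format("parquet").partitionBy("country").mode("append").save("temp.parquet")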
Use coalesce before the write operation:
dataFrame.coalesce(1).write.format("parquet").mode("append").save("temp.parquet")
EDIT-1
Upon a closer look, the docs do warn about coalesce:
However, if you're doing a drastic coalesce, e.g. to numPartitions = 1, this may result in your computation taking place on fewer nodes than you like (e.g. one node in the case of numPartitions = 1)
Therefore, as suggested by @Amar, it's better to use repartition.
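A sketch of the same write using repartition; unlike coalesce(1), repartition(1) performs a full shuffle, so the upstream computation stays distributed and only the final output is collapsed into a single file:

// Full shuffle into one partition, producing a single output file per write.
dataFrame.repartition(1).write.format("parquet").mode("append").save("temp.parquet")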