
Spark foreachPartition, how to get the index of each partition?

In Spark's foreachPartition, how can I get the index of the current partition (or a sequence number, or anything else that identifies the partition)?

val docs: RDD[String] = ...

println("num partitions: " + docs.getNumPartitions)

docs.foreachPartition((it: Iterator[String]) => {
  println("partition index: " + ???)
  it.foreach(...)
})
asked Jan 22 '18 by David Portabella
People also ask

How Spark decides number of partitions?

The number of partitions in Spark should be chosen thoughtfully, based on the cluster configuration and the requirements of the application. Increasing the number of partitions means each partition holds less data, or possibly no data at all.
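For example, a minimal sketch (assuming the sc provided by spark-shell started with --master local[4], so the default parallelism is 4):

val rdd = sc.parallelize(1 to 100)
println(rdd.getNumPartitions)  // 4: one partition per local core by default

// more partitions mean less data per partition (some may even be empty)
println(rdd.repartition(200).getNumPartitions)  // 200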

What is foreach in Spark?

In Spark, foreach() is an action operation, available on RDD, DataFrame, and Dataset, that iterates over each element in the dataset. It is similar to a for loop, except that the function runs on the executors rather than on the driver.
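A minimal sketch (again assuming spark-shell's sc); note that on a cluster the println output appears in the executor logs, not on the driver console:

sc.parallelize(Seq("a", "b", "c"))
  .foreach(x => println(x))  // runs once per element, on the executors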

How do I change the number of partitions in a Spark data frame?

If you want to change the number of partitions of your DataFrame, all you need to run is the repartition() function: it returns a new DataFrame partitioned by the given partitioning expressions, and the resulting DataFrame is hash partitioned.
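A minimal sketch (assuming the spark session provided by spark-shell):

val df = spark.range(1000).toDF("id")
println(df.rdd.getNumPartitions)        // depends on the default parallelism

val df8 = df.repartition(8)             // new, hash-partitioned DataFrame
println(df8.rdd.getNumPartitions)       // 8

val byId = df.repartition(8, df("id"))  // hash partitioned on the id column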


1 Answer

You can use TaskContext (How to get ID of a map task in Spark?):

import org.apache.spark.TaskContext

rdd.foreachPartition((it: Iterator[String]) => {
  println(TaskContext.getPartitionId)
})
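Applied to the snippet from the question, a complete sketch looks like this (the sample data and partition count are made up for illustration):

import org.apache.spark.TaskContext

val docs = sc.parallelize(Seq("doc1", "doc2", "doc3", "doc4"), 2)
println("num partitions: " + docs.getNumPartitions)  // 2

docs.foreachPartition((it: Iterator[String]) => {
  // TaskContext.getPartitionId returns the index of the partition
  // processed by the current task
  println("partition index: " + TaskContext.getPartitionId)
  it.foreach(doc => println(doc))
})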
answered Sep 27 '22 by Alper t. Turker