In Spark, sc.textFile takes a minPartitions parameter, on both SparkContext and JavaSparkContext. What does this parameter imply?
minPartitions will be passed to Hadoop's InputFormat.getSplits. The parameter is a hint, so you may get more or fewer partitions, depending on the Hadoop InputFormat implementation.
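
To see why the value is only a hint, here is a rough sketch in plain Python (no Spark required) of how a file-based InputFormat might turn the requested count into splits for a single file. It is a simplified model, assuming the behaviour of Hadoop's older FileInputFormat: a goal size of file size divided by the requested count, capped at the block size, with the last split allowed to run about 10% over. The function name estimate_splits, its parameters, and the 128 MiB default block size are illustrative assumptions, not part of any Spark or Hadoop API.

```python
SPLIT_SLOP = 1.1  # assumed: the final split may be up to ~10% larger than split_size


def estimate_splits(file_size, min_partitions,
                    block_size=128 * 1024 * 1024,
                    min_split_size=1,
                    splittable=True):
    """Estimate how many splits one non-empty file yields (hypothetical helper).

    file_size and block_size are in bytes. This models a single file only;
    real jobs sum the splits over every input file.
    """
    if file_size <= 0:
        raise ValueError("this sketch only models non-empty files")
    if not splittable:
        # A non-splittable file (for example gzip-compressed text) is read as
        # one split, no matter how many partitions were requested.
        return 1

    # Target size per split if the request were honoured exactly.
    goal_size = file_size // max(min_partitions, 1)
    # Never exceed the block size, never go below the minimum split size.
    split_size = max(min_split_size, min(goal_size, block_size))

    count = 0
    remaining = file_size
    while remaining / split_size > SPLIT_SLOP:
        count += 1
        remaining -= split_size
    if remaining != 0:
        count += 1  # the tail becomes the last split
    return count


if __name__ == "__main__":
    one_gib = 1024 * 1024 * 1024
    print(estimate_splits(one_gib, 2))                     # more than requested
    print(estimate_splits(one_gib, 16))                    # matches the request
    print(estimate_splits(one_gib, 16, splittable=False))  # fewer than requested
```

Under these assumptions, a 1 GiB file with minPartitions = 2 still comes out as one split per block, while a non-splittable file stays a single partition however large the request. In a real job, checking the resulting RDD's partition count is the reliable way to find out what you actually got.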