
In Spark, what does the "minPartitions" parameter do in SparkContext.textFile(path, minPartitions)?

Tags:

apache-spark

In Spark, both SparkContext and JavaSparkContext accept a minPartitions parameter when you call sc.textFile. What does this parameter mean?

EdwinGuo asked Jul 21 '14 17:07


1 Answer

minPartitions is passed to Hadoop's InputFormat.getSplits. The parameter is only a hint, so you may get more or fewer partitions than you asked for, depending on the Hadoop InputFormat implementation.
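To make "hint" concrete, here is a rough pure-Python model of how a file-based InputFormat might turn the hint into splits. It is a simplified sketch of the logic in Hadoop's old-API FileInputFormat.getSplits, not the real implementation: the function name `estimate_splits`, its parameters, and the 128 MiB default block size are assumptions for illustration, and it only considers non-empty, splittable files.

```python
# Simplified model of split computation in a file-based Hadoop InputFormat.
# Assumption: every file is splittable (e.g. not gzip) and non-empty.

SPLIT_SLOP = 1.1  # the last split of a file may be up to 10% larger than the target


def estimate_splits(file_sizes, min_partitions,
                    block_size=128 * 1024 * 1024, min_size=1):
    """Estimate how many splits (partitions) the files would produce.

    file_sizes     -- list of file lengths in bytes
    min_partitions -- the hint passed to textFile
    block_size     -- assumed filesystem block size in bytes
    min_size       -- assumed configured minimum split size in bytes
    """
    total = sum(file_sizes)
    # Target size per split if the hint were honoured exactly.
    goal_size = total // max(min_partitions, 1)
    # A split never exceeds a block and never drops below the minimum size.
    split_size = max(1, min_size, min(goal_size, block_size))

    count = 0
    for size in file_sizes:
        remaining = size
        # Carve off full-size splits while clearly more than one split remains.
        while remaining / split_size > SPLIT_SLOP:
            count += 1
            remaining -= split_size
        # Whatever is left over becomes the final split of this file.
        if remaining > 0:
            count += 1
    return count


MB = 1024 * 1024
one_gib_file = [1024 * MB]

more_than_hint = estimate_splits(one_gib_file, min_partitions=2)
matches_hint = estimate_splits(one_gib_file, min_partitions=16)
fewer_than_hint = estimate_splits(one_gib_file, min_partitions=16,
                                  min_size=256 * MB)
print(more_than_hint, matches_hint, fewer_than_hint)
```

Under these assumptions, a hint of 2 on a 1 GiB file still yields one split per 128 MiB block, a hint of 16 is honoured exactly, and a large minimum split size pushes the count below the hint. On a real cluster the actual number can be checked with `sc.textFile(path, minPartitions).getNumPartitions()` in PySpark.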

Daniel Darabos answered Oct 30 '22 06:10


