In my mapPartitions step there is multi-threaded work to do, so I use a thread pool and want to run tasks in parallel. But I cannot distinguish these two parameters: spark.task.cpus and --executor-cores. My guess is that I can set --executor-cores to 5 and run 4 threads in my task. Is that right?
spark.task.cpus is the number of cores to allocate for each task, while --executor-cores specifies the number of cores per executor.
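For example (the numbers are purely illustrative, and my_app.py is a made-up application name), both settings can be passed on the spark-submit command line:

```shell
# Illustrative values only: each executor gets 5 cores, and with the
# default spark.task.cpus=1 each task is scheduled on 1 core, so up to
# 5 tasks can run concurrently per executor.
spark-submit \
  --executor-cores 5 \
  --conf spark.task.cpus=1 \
  my_app.py
```

If a task internally spawns several threads, as in the question, raising spark.task.cpus (e.g. to 4 or 5) tells the scheduler to reserve that many cores per task, so the threads do not oversubscribe the executor.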
There is a small difference between executors and tasks, as explained here. To find out how many threads you can run per core, go through this post. As per the links:
When you create the SparkContext, each worker starts an executor. This is a separate process (JVM). The executors connect back to your driver program. Now the driver can send them commands, like flatMap, map and reduceByKey, these commands are tasks.
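To connect this back to the question: the threads live inside a single task. Here is a minimal sketch of that pattern in PySpark (the names expensive_fn and process_partition are made up for illustration); the per-partition function is plain Python, so the thread pool is independent of Spark's own scheduling:

```python
from concurrent.futures import ThreadPoolExecutor

def expensive_fn(record):
    # Placeholder for per-record work (I/O, RPC calls, etc.)
    return record * 2

def process_partition(records):
    # Runs inside ONE Spark task; the 4 threads share that task's cores.
    # Setting spark.task.cpus=4 would reserve 4 cores for this task.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(expensive_fn, records))

# Inside a Spark job this would be used as:
#   rdd.mapPartitions(process_partition)
```

Because process_partition is ordinary Python, you can test it without a cluster: process_partition(iter([1, 2, 3])) returns [2, 4, 6].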
To find the number of threads your CPU supports per core, run lscpu and check the value of Thread(s) per core.
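For example, on a Linux machine with hyper-threading enabled (the output shown is from one machine and will differ on yours):

```shell
lscpu | grep 'Thread(s) per core'
# e.g.  Thread(s) per core:    2
```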