
What happens if I try to use more cores than I have?

Tags:

apache-spark

In my SparkConf, I can set the number of cores to use. My laptop has 4 physical cores (8 logical). What does Spark do if I specify a number the machine cannot provide, say 100 cores?

Kristian asked Jan 20 '16 23:01


2 Answers

The number of cores does not refer to physical cores but to the number of running threads. Nothing strange happens if that number is higher than the number of available cores.

Depending on your setup, this can actually be the preferred configuration; a value of around twice the number of available cores is a commonly recommended setting. Obviously, if the number is too high, your application will spend more time switching between threads than doing actual processing.
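To make the "cores are really threads" point concrete, here is a minimal sketch in plain Python rather than Spark. It is only an analogy: the helper name run_with_threads and the figure of 100 threads are illustrative assumptions, not anything from Spark itself. It asks for far more worker threads than the machine has logical cores and shows that the work still completes, because the OS simply time-slices the threads.

```python
import os
import threading
from concurrent.futures import ThreadPoolExecutor


def run_with_threads(num_threads, num_tasks):
    """Run num_tasks trivial tasks on a pool of up to num_threads threads.

    Returns the task results and the set of thread names that did work.
    (Hypothetical helper, for illustration only.)
    """
    used = set()
    lock = threading.Lock()

    def task(i):
        # Record which worker thread picked up this task.
        with lock:
            used.add(threading.current_thread().name)
        return i * i

    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        results = list(pool.map(task, range(num_tasks)))
    return results, used


logical_cores = os.cpu_count() or 1
requested = 100  # deliberately more than a typical laptop provides
results, used = run_with_threads(requested, 1000)

print(f"logical cores: {logical_cores}, requested threads: {requested}")
print(f"tasks completed: {len(results)}, threads actually used: {len(used)}")
```

Nothing fails when the requested thread count exceeds the core count; the cost shows up only as scheduling overhead once the threads compete for CPU time.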

zero323 answered Nov 12 '22 13:11


It depends heavily on your cluster manager. I assume you're asking about the local[n] run mode.

If so, the driver and the one and only executor run in the same JVM, with n threads.

DAGScheduler, the Spark execution planner, will use n threads to schedule as many tasks as you've told it to.

If you have more tasks (i.e. threads) than cores, your OS has to deal with more threads than cores and schedule them appropriately.
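As a rough sketch of what asking for 100 "cores" in local mode looks like from PySpark: the helpers local_master and start_local_context and the app name are hypothetical names made up for this example, while SparkConf.setMaster, SparkConf.setAppName and the local[n] / local[*] master URLs are standard Spark. The Spark part is wrapped in a function that is not called here, since it needs a working PySpark and Java installation.

```python
def local_master(threads=None):
    """Build a Spark master URL for local mode (hypothetical helper).

    threads=None yields "local[*]", which asks Spark for one worker
    thread per logical core; an integer yields "local[n]".
    """
    if threads is None:
        return "local[*]"
    if int(threads) < 1:
        raise ValueError("thread count must be at least 1")
    return f"local[{int(threads)}]"


def start_local_context(threads=None):
    """Start a local SparkContext with the given thread count.

    Assumes PySpark is installed; the import is kept inside the
    function so the rest of this sketch runs without it.
    """
    from pyspark import SparkConf, SparkContext

    conf = (
        SparkConf()
        .setMaster(local_master(threads))
        .setAppName("oversubscribed-local")  # illustrative app name
    )
    return SparkContext(conf=conf)


# Asking for 100 threads on an 8-logical-core laptop is a valid setting.
master = local_master(100)
print(master)
```

Calling start_local_context(100) would then start a single JVM acting as both driver and executor with 100 task threads, leaving it to the OS to multiplex them onto the real cores.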

Jacek Laskowski answered Nov 12 '22 13:11