I am writing a multiprocessing routine to run on a server with plenty of CPUs. However, the server has multiple users and its usage may vary, so I would like to adapt the number of processors being used according to the current load.
I can get the number of CPUs with multiprocessing.cpu_count() and size a pool with multiprocessing.Pool(processes=no_cpus), but is there a way to adjust this
during activity, in case the load on the server has changed after a while?
The os.cpu_count() method returns the number of CPUs in the system as an integer, or None if the number is undetermined. It takes no parameters. multiprocessing.cpu_count() reports the same number but raises NotImplementedError instead of returning None.
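As a minimal sketch of handling the undetermined case, the helper below (usable_cpu_count is a hypothetical name, not part of any library) prefers the process's CPU affinity mask where the platform offers it, since a process on a shared server may be restricted to fewer CPUs than the machine has, and falls back to a default of 1 when the count is unknown.

```python
import os

def usable_cpu_count(default=1):
    """Return the number of CPUs this process may use, or `default` if unknown."""
    # os.sched_getaffinity is only available on some Unix platforms (e.g. Linux);
    # it reflects restrictions such as those set with taskset.
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    # os.cpu_count() returns None when the count cannot be determined.
    count = os.cpu_count()
    return count if count is not None else default

print(usable_cpu_count())
```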
In standard CPython builds, single-CPU use is caused by the global interpreter lock (GIL), which allows only one thread to execute Python bytecode at any given time. The GIL was introduced to simplify memory management, but as a result, the threads of a single Python process cannot run Python code on more than one CPU at once.
Multiprocessing in Python sidesteps the GIL by running tasks in separate processes, which lets the program use multiple CPU cores in parallel.
On a Computer CI form, the "CPU count" means the number of physical CPUs (sockets). The "CPU core count" means the number of cores in one physical CPU (socket). When we are looking at the CPUs from the Task Manager/vSphere/lscpu, it doesn't actually show the number of physical CPUs.
There are a number of complications...
Processes (and threads) are scheduled by the Linux kernel on any CPU. Even determining the "current CPU" is awkward -- see How can I see which CPU core a thread is running in?
multiprocessing.Pool is designed to start up N workers, which run "forever." Each accepts a task from a queue, does some work, then outputs data. A Pool doesn't change size.
Two suggestions:
Check the load average, for example in the output of uptime:
19:05:07 up 4 days, 20:43, 3 users, load average: 0.99, 1.01, 0.82
The last three numbers are the "load average" over the last minute, five minutes, and 15 minutes. Consider using the first number to load-balance your application.
Call time.sleep(factor) after completing each piece of work. You can then increase the factor when the system is busy (high load average) and shorten the delay when the system is more idle (low load, e.g. other users are only surfing). The Pool stays the same size.
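The two suggestions can be combined in a few lines. The sketch below is only an illustration: throttle_delay and do_work_politely are hypothetical names, and the linear ramp (no delay below 50% load per CPU, the full delay at 100%) is an arbitrary assumption rather than a recommended tuning. os.getloadavg() is available on Unix only.

```python
import os
import time

def throttle_delay(load, cpus, max_delay=5.0):
    """Map a load average to a sleep time in seconds (assumed linear scale)."""
    if cpus <= 0:
        raise ValueError("cpus must be positive")
    # Load per CPU: a value near 1.0 means every CPU is busy.
    ratio = load / cpus
    # No delay below 50% utilisation, then ramp linearly up to max_delay at 100%.
    scaled = (ratio - 0.5) / 0.5
    return max_delay * min(max(scaled, 0.0), 1.0)

def do_work_politely(task, item):
    """Run task(item), then sleep in proportion to the 1-minute load average."""
    result = task(item)
    load_1min = os.getloadavg()[0]  # first of the three load averages
    time.sleep(throttle_delay(load_1min, os.cpu_count() or 1))
    return result
```

Each worker in the pool would call do_work_politely instead of the task itself, so the pool keeps its size while its throughput backs off under load.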
I followed johntellsall's solution; here is a simple schematic. One complication is that Python (for me) does not distinguish virtual from physical CPUs. I've decided to calibrate against the average load of the past 15 minutes.
The sleep durations are quite arbitrary.
import multiprocessing as mp
import os
import time

def sleepForMultiCore():
    # divide by 2 since cpu_count() does not distinguish physical and virtual cores
    cores = 0.5 * mp.cpu_count()
    # third value of getloadavg() is the 15-minute load average (Unix only)
    loadAvg = os.getloadavg()[2]
    if loadAvg > cores * 1.3:
        sleepTime = 5 * 60
    elif loadAvg > cores:
        sleepTime = 2 * 60
    elif loadAvg > cores * 0.9:
        sleepTime = 1 * 60
    else:
        sleepTime = 0
    print('sleeping for', sleepTime)
    time.sleep(sleepTime)
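Sleeping is not the only option. Since a Pool cannot be resized, a different technique is to split the work into batches and create a fresh Pool, sized from the current load, before each batch. The following is a rough sketch of that idea, not something proposed in the answers above; pick_pool_size and run_in_batches are hypothetical names, and the estimate is crude because the load average also counts this program's own workers.

```python
import multiprocessing as mp
import os

def pick_pool_size(load, cpus, minimum=1):
    """Estimate how many CPUs are idle from a load average."""
    idle = cpus - int(round(load))
    # Never go below `minimum` workers or above the CPU count.
    return max(minimum, min(cpus, idle))

def run_in_batches(func, items, batch_size=100):
    """Process items batch by batch, sizing a fresh Pool before each batch."""
    results = []
    for start in range(0, len(items), batch_size):
        batch = items[start:start + batch_size]
        # Re-read the 1-minute load average (Unix only) for every batch.
        workers = pick_pool_size(os.getloadavg()[0], mp.cpu_count())
        with mp.Pool(processes=workers) as pool:
            results.extend(pool.map(func, batch))
    return results
```

When calling run_in_batches from a script, func should be a module-level function and the call should sit under an if __name__ == "__main__": guard, as usual for multiprocessing. Starting a pool per batch has a cost, so batches should be large enough to make it negligible.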