
Get number of busy CPUs in Python

I am writing a multiprocessing routine to run on a server with plenty of CPUs. However, the server has multiple users and its usage may vary. So I would like to adapt the number of processors being used according to the current load.

  • Is there a way to estimate the number of CPUs currently busy in Python? I only found multiprocessing.cpu_count(), which reports the total number of CPUs.
  • Bonus question: Is it possible to change the size of a multiprocessing.Pool(processes=no_cpus) while it is running, in case the load on the server has changed after a while?
n1000 asked Jul 14 '14 09:07



2 Answers

There are a number of complications...

  • you can't determine which CPUs are busy

Processes (and threads) are scheduled by the Linux kernel on any CPU. Even determining the "current CPU" is awkward -- see How can I see which CPU core a thread is running in?

  • multiprocessing.Pool is designed to start up N workers, which run "forever." Each accepts a task from a queue, does some work, then outputs data. A Pool doesn't change size.
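Because a Pool keeps the size it was created with, one possible workaround is to split the work into batches and create a fresh Pool for each batch, sized from the load at that moment. The following is only a minimal sketch under stated assumptions, not something proposed in this answer: the names `workers_for_load` and `run_in_batches`, and the `own_workers` correction, are hypothetical and introduced here for illustration. It relies on `os.getloadavg()`, which is available on Unix only.

```python
import math
import multiprocessing as mp
import os


def workers_for_load(load_avg, total_cpus, own_workers=0):
    """Estimate how many workers to start, given a load average.

    own_workers is a rough correction (an assumption): our previous workers
    contributed to the load average, so that share is subtracted first.
    """
    foreign_load = max(0.0, load_avg - own_workers)
    free_cpus = total_cpus - math.ceil(foreign_load)
    # Always keep at least one worker, never more than the CPU count.
    return max(1, min(total_cpus, free_cpus))


def run_in_batches(func, items, batch_size):
    """Map func over items, creating a newly sized Pool for every batch."""
    total_cpus = os.cpu_count() or 1
    workers = 0
    results = []
    for start in range(0, len(items), batch_size):
        load1 = os.getloadavg()[0]  # 1-minute load average
        workers = workers_for_load(load1, total_cpus, own_workers=workers)
        with mp.Pool(processes=workers) as pool:
            results.extend(pool.map(func, items[start:start + batch_size]))
    return results
```

`run_in_batches` should be called with a module-level function and from under an `if __name__ == "__main__":` guard, as usual for multiprocessing. The batch size trades responsiveness against the cost of starting a new Pool: smaller batches react to load changes sooner but pay the startup cost more often. Since the load average is a damped value, the `own_workers` correction is only approximate.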

Two suggestions:

  • the uptime command outputs something like this:

19:05:07 up 4 days, 20:43, 3 users, load average: 0.99, 1.01, 0.82

The last three numbers are the "load average" over the last 1, 5, and 15 minutes. Consider using the first number (the 1-minute average) to load-balance your application.

  • consider having your application do time.sleep(factor) after completing each piece of work.

Thus you can increase the factor when the system is busy (high load average), and shorten the delay when the system is mostly idle (low load, e.g. users are only surfing). The Pool stays the same size.
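The two suggestions above can be combined into a short sketch: read the 1-minute load average and turn it into a sleep factor that grows with the load per CPU. This is a rough illustration under assumptions, not a tuned implementation: the names `throttle_delay` and `pause_for_load`, the linear scaling, and the 5-second cap are all hypothetical choices. `os.getloadavg()` returns the same three numbers that uptime prints and is available on Unix only.

```python
import os
import time


def throttle_delay(load1, cpus, max_delay=5.0):
    """Return a delay in seconds that grows linearly with load per CPU.

    The delay is 0 on an idle machine and reaches max_delay once the load
    average equals or exceeds the number of CPUs (an arbitrary choice).
    """
    if cpus <= 0:
        raise ValueError("cpus must be positive")
    ratio = load1 / cpus
    return max_delay * min(1.0, max(0.0, ratio))


def pause_for_load(max_delay=5.0):
    """Sleep according to the current 1-minute load average; return the delay."""
    load1 = os.getloadavg()[0]
    delay = throttle_delay(load1, os.cpu_count() or 1, max_delay)
    time.sleep(delay)
    return delay
```

A worker would call `pause_for_load()` after finishing each piece of work. Keeping the calculation in `throttle_delay` separate from the sleeping makes the thresholds easy to check without waiting.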

johntellsall answered Oct 31 '22 17:10


I followed johntellsall's suggestion; here is a simple sketch. There is one caveat: Python (for me) does not distinguish virtual from physical CPUs, so I halve the reported count. I decided to calibrate against the load average of the past 15 minutes.

The sleep durations are quite arbitrary.

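import multiprocessing as mp
import os
import time

def sleepForMultiCore():
    # Halve the count: cpu_count() reports logical CPUs, and this assumes
    # two hardware threads per physical core (adjust if that is not the case).
    cores = 0.5 * mp.cpu_count()
    # Index 2 is the 15-minute load average (os.getloadavg() is Unix only).
    loadAvg = os.getloadavg()[2]

    # Sleep longer the further the load exceeds the physical core count.
    if loadAvg > cores * 1.3:
        sleepTime = 5 * 60
    elif loadAvg > cores:
        sleepTime = 2 * 60
    elif loadAvg > cores * 0.9:
        sleepTime = 1 * 60
    else:
        sleepTime = 0
    print('sleeping for', sleepTime, 'seconds')
    time.sleep(sleepTime)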
FooBar answered Oct 31 '22 16:10