Get number of busy CPUs in Python

Tags:

python

multiprocessing

I am writing a multiprocessing routine to run on a server with plenty of CPUs. However, the server has multiple users and its usage may vary. So I would like to adapt the number of processors being used according to the current load.

Is there a way to estimate the number of CPUs currently busy in Python? I only found multiprocessing.cpu_count().
Bonus question: Is it possible to change multiprocessing.Pool(processes=no_cpus) during activity, in case the load on the server has changed after a while?

asked Jul 14 '14 09:07

n1000

2 Answers

There are a number of complications...

You can't determine which CPUs are busy.

Processes (and threads) are scheduled by the Linux kernel on any CPU. Even determining the "current CPU" is awkward -- see How can I see which CPU core a thread is running in?

multiprocessing.Pool is designed to start up N workers, which run "forever." Each accepts a task from a queue, does some work, then outputs data. A Pool doesn't change size.
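Since a Pool keeps its size, one possible workaround for the bonus question is to split the work into batches and start a fresh Pool for each batch, sized from the load at that moment. The following is only a sketch under assumptions: `pick_pool_size`, `run_in_batches`, and the placeholder task `square` are made-up names, and "CPUs minus the rounded 1-minute load average" is a rough heuristic, not an exact count of idle CPUs.

```python
import multiprocessing as mp
import os


def pick_pool_size(load_1min, n_cpus):
    """Rough guess at a worker count: CPUs minus current load, at least 1."""
    free = n_cpus - int(round(load_1min))
    return max(1, min(n_cpus, free))


def square(x):
    # Hypothetical placeholder task; stands in for the real work.
    return x * x


def run_in_batches(items, batch_size=100):
    """Process items batch by batch, starting a fresh Pool for each batch."""
    results = []
    for start in range(0, len(items), batch_size):
        batch = items[start:start + batch_size]
        load_1min = os.getloadavg()[0]  # 1-minute load average, Unix only
        workers = pick_pool_size(load_1min, mp.cpu_count())
        # The pool size is fixed once created, so re-create it per batch.
        with mp.Pool(processes=workers) as pool:
            results.extend(pool.map(square, batch))
    return results
```

Call `run_in_batches(...)` from under an `if __name__ == "__main__":` guard. Keep in mind that the load average also counts your own workers, so the estimate tends to shrink the pool while your job is running; smaller batches react faster but pay the pool start-up cost more often.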

Two suggestions:

the uptime command outputs something like this:

19:05:07 up 4 days, 20:43, 3 users, load average: 0.99, 1.01, 0.82

The last three numbers are the "load average" over the last minute, five minutes, and 15 minutes. Consider using the first number to load-balance your application.

Consider having your application do time.sleep(factor) after completing each piece of work.

That way you can increase the factor when the system is busy (high load average) and shorten the delay when the system is idle (low load average). The Pool stays the same size.
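Both suggestions can be combined in Python, because os.getloadavg() returns the same three numbers that uptime prints. A minimal sketch, assuming a linear mapping from load per CPU to delay; the names `sleep_factor` and `throttle` and the 10-second ceiling are arbitrary choices for illustration:

```python
import os
import time


def sleep_factor(load_1min, n_cpus, max_sleep=10.0):
    """Map load per CPU to a delay: 0 when idle, max_sleep at or above full load."""
    per_cpu = load_1min / n_cpus
    return max_sleep * min(1.0, max(0.0, per_cpu))


def throttle(max_sleep=10.0):
    """Sleep after a piece of work, longer when the machine is busier."""
    load_1min = os.getloadavg()[0]  # first number = 1-minute average, Unix only
    n_cpus = os.cpu_count() or 1    # cpu_count() may return None
    delay = sleep_factor(load_1min, n_cpus, max_sleep)
    time.sleep(delay)
    return delay
```

Each worker would call `throttle()` at the end of its task function. The mapping itself is a guess; tune it to how polite you want to be toward the other users.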

answered Oct 31 '22 17:10

johntellsall

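I followed johntellsall's solution; here is a simple sketch of what I ended up with. There is one caveat: cpu_count() reports logical CPUs, so on my machine Python does not distinguish virtual (hyper-threaded) cores from physical ones, and I halve the count. I decided to calibrate against the average load of the past 15 minutes.

The thresholds and sleep times are quite arbitrary.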

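import multiprocessing as mp
import os
import time

def sleepForMultiCore():
    # Divide by 2 since cpu_count() reports logical CPUs and does not
    # distinguish physical from virtual (hyper-threaded) cores
    cores = 0.5 * mp.cpu_count()
    # Third value is the 15-minute load average (os.getloadavg() is Unix only)
    loadAvg = os.getloadavg()[2]

    if loadAvg > cores * 1.3:
        sleepTime = 5 * 60
    elif loadAvg > cores:
        sleepTime = 2 * 60
    elif loadAvg > cores * 0.9:
        sleepTime = 1 * 60
    else:
        sleepTime = 0
    print('sleeping for', sleepTime)
    time.sleep(sleepTime)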

answered Oct 31 '22 16:10

FooBar
