Python script execution time increases when executed multiple time parallely

Tags:

4 Answers

1) You really can't ensure the execution will take always the same time, but at least you can avoid performance degradation by using a "locking" strategy like the ones described in this answer.

Basically you can test if the lockfile exists, and if so, put your program to sleep a certain amount of time, then try again.

If the program does not find the lockfile, it creates it, and delete the lockfile at the end of its execution.

Please note: in the below code, when the script fails to get the lock for a certain number of retries, it will exit (but this choice is really up to you).

The following code exemplifies the use of a file as a "lock" against parallel executions of the same script.

import time
import os
import sys

lockfilename = '.lock'
retries = 10
fail = True

for i in range(retries):
    try:
        lock = open(lockfilename, 'r')
        lock.close()
        time.sleep(1)
    except Exception:
        print('Got after {} retries'.format(i))
        fail = False
        lock = open(lockfilename, 'w')
        lock.write('Locked!')
        lock.close()
        break

if fail:
    print("Cannot get the lock, exiting.")
    sys.exit(2)

# program execution...
time.sleep(5)
# end of program execution

os.remove(lockfilename)

2) This would mean that different python instances share the same memory pool and I think it's not feasible.

144

answered Nov 13 '22 06:11

More servers equals more availability

Hearsay tells me that one effective way to ensure consistent request times is to use multiple requests to a cluster. As I heard it the idea goes something like this.

The chance of a slow request

_{(Disclaimer I'm not much of a mathematician or statistician.)}

If there is a 1% chance a request is going to take an abnormal amount of time to finish then one-in-a-hundred requests can be expected to be slow. If you as a client/consumer make two requests to a cluster instead of just one, the chance that both of them turn out to be slow would be more like 1/10000, and with three 1/1000000, et cetera. The downside is doubling your incoming requests means needing to provide (and pay for) as much as twice the server power to fulfill your requests with a consistent time, this additional cost scales with how much chance is acceptable for a slow request.

To my knowledge this concept is optimized for consistent fulfillment times.

The client

A client interfacing with a service like this has to be able to spawn multiple requests and handle them gracefully, probably including closing the unfulfilled connections as soon as it can.

The servers

On the backed there should be a load balancer that can associate multiple incoming client requests to multiple unique cluster workers. If a single client makes multiple requests to an overburdened node, its just going to compound its own request time like you see in your simple example.

In addition to having the client opportunistically close connections it would be best to have a system of sharing job fulfilled status/information so that backlogged request on other other slower-to-process nodes have a chance of aborting an already-fulfilled request.

This this a rather informal answer, I do not have direct experience with optimizing a service application in this manner. If someone does I encourage and welcome more detailed edits and expert implementation opinions.

Caching imports

yes that is a thing, and its awesome!

I would personally recommend setting up django+gunicorn+nginx. Nginx can cache static content and keep a request backlog, gunicorn provides application caching and multiple threads&worker management (not to mention awesome administration and statistic tools), django embeds best practices for database migrations, auth, request routing, as well as off-the-shelf plugins for providing semantic rest endpoints and documentation, all sorts of goodness.

If you really insist on building it from scratch yourself you should study uWsgi, a great Wsgi implementation that can be interfaced with gunicorn to provide application caching. Gunicorn isn't the only option either, Nicholas Piël has a Great write up comparing performance of various python web serving apps.

answered Nov 13 '22 07:11

ThorSummoner

Here's what we have:

EC2 instance type is m3.large box which has only 2 vCPUs https://aws.amazon.com/ec2/instance-types/?nc1=h_ls
We need to run a CPU- and memory-hungry script which takes over a second to execute when CPU is not busy
You're building an API than needs to handle concurrent requests and running apache
From the screenshot I can conclude that:
- your CPUs are 100% utilized when 5 processes are run. Most likely they would be 100% utilized even when fewer processes are run. So this is the bottleneck and no surprise that the more processes are run the more time is required — you CPU resources just get shared among concurrently running scripts.
- each script copy eats about ~300MB of RAM so you have lots of spare RAM and it's not a bottleneck. The amount of free + buffers memory on your screenshot confirms that.
The missing part is:
1. are requests directly sent to your apache server or there's a balancer/proxy in front of it?
2. why do you need PHP in your example? There are plently of solutions available using python ecosystem only without a php wrapper ahead of it

Answers to your questions:

That's infeasible in general case

The most you can do is to track your CPU usage and make sure its idle time doesn't drop below some empirical threshold — in this case your scripts would be run in more or less fixed amount of time.

To guarantee that you need to limit the number of requests being processed concurrently. But if 100 requests are sent to your API concurrently you won't be able to handle them all in parallel! Only some of them will be handled in parallel while others waiting for their turn. But your server won't be knocked down trying to serve them all.

Yes and no

No because unlikely can you do something in your present architecture when a new script is launched on every request through a php wrapper. BTW it's a very expensive operation to run a new script from scratch each time.

Yes if a different solution is used. Here are the options:

use a python-aware pre-forking webserver which will handle your requests directly. You'll spare CPU resources on python startup + you might utilize some preloading technics to share RAM among workers, i.e http://docs.gunicorn.org/en/stable/settings.html#preload-app. You'd also need to limit the number of parallel workers to be run http://docs.gunicorn.org/en/stable/settings.html#workers to adress your first requirement.
if you need PHP for some reason you might setup some intermediary between PHP script and python workers — i.e. a queue-like server. Than simply run several instances of your python scripts which would wait for some request to be availble in the queue. Once it's available it would handle it and put the response back to the queue and php script would slurp it and return back to the client. But it's a more complex to build this that the first solution (if you can eliminate your PHP script of course) and more components would be involved.
reject the idea to handle such heavy requests concurrently, and instead assign each request a unique id, put the request into a queue and return this id to the client immediately. The request will be picked up by an offline handler and put back into the queue once it's finished. It will be client's responsibility to poll your API for readiness of this particular request
1st and 2nd combined — handle requests in PHP and request another HTTP server (or any other TCP server) handling your preloaded .py-scripts

answered Nov 13 '22 07:11

ffeast

The ec2 cloud does not guarantee 7.5gb of free memory on the server. This would mean that the VM performance is severely impacted like you are seeing where the server has less than 7.5gb of physical free ram. Try reducing the amount of memory the server thinks it has.

This form of parallel performance is very expensive. Typically with 300mb requirement, the ideal would be a script which is long running, and re-uses the memory for multiple requests. The Unix fork function allows a shared state to be re-used. The os.fork gives this in python, but may not be compatible with your libraries.

answered Nov 13 '22 06:11

mksteve

Related questions
                            
                                Using `concurrent.futures.Future` as promise
                            
                                Using scipy.optimize.curve_fit with weights
                            
                                Comparison of R, statmodels, sklearn for a classification task with logistic regression
                            
                                parsing age of empires game record files(.mgx)
                            
                                Django - Disable form select field validation
                            
                                Multiple default values specified for column "id" of the table
                            
                                How can I keep python from loading the 'wrong' package?
                            
                                With `pandas.cut()`, how do I get integer bins and avoid getting a negative lowest bound?
                            
                                flask-admin form: Constrain Value of Field 2 depending on Value of Field 1
                            
                                Build a graph that works with variable batch size using Tensorflow
                            
                                Running python/bash code in Rstudio
                            
                                Why does python's timeit use the 'best of 3' to measure the time elapsed?
                            
                                Mean line on top of bar plot with pandas and matplotlib
                            
                                Inconsistent Anaconda root env after failed Python 3.5 update
                            
                                Fixed identifier for a machine (uuid.getnode)
                            
                                numpy boolean comparison between big arrays returns False instead of boolean array
                            
                                Markdown in Spyder IDE
                            
                                How to update OpenSSL on mac?
                            
                                Create an if-else condition column in dask dataframe
                            
                                people.connections.list not returning contacts using Python Client Library

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With