I'm currently learning Python (from a Java background), and I have a question about something I would have used threads for in Java.
My program will use workers to read from some web-service some data periodically. Each worker will call on the web-service at various times periodically.
From what I have read, it's preferable to use the multiprocessing
module and set up the workers as independent processes that get on with their data-gathering tasks. On Java I would have done something conceptually similar, but using threads. While it appears I can use threads in Python, I'll lose out on multi-cpu utilisation.
Here's the guts of my question: The web-service is throttled, viz., the workers must not call on it more than x times per second. What is the best way for the workers to check on whether they may request data?
I'm confused as to whether this should be achieved using:
nmap
, to share some data/value between the processes that describes if they may call the web-service.Manager()
object that monitors the calls per seconds and informs workers if they have permission to make their calls.Of course, I guess this may come down to how I keep track of the calls per second. I suppose one option would be for the workers to call a function on some other object, which makes the call to the web-service and records the current number of calls/sec. Another option would be for the function that calls the web-service to live within each worker, and for them to message a managing object every time they make a call to the web-service.
Thoughts welcome!
A queue is a data structure on which items can be added by a call to put() and from which items can be retrieved by a call to get(). The multiprocessing. Queue provides a first-in, first-out FIFO queue, which means that the items are retrieved from the queue in the order they were added.
Passing Messages to Processes A simple way to communicate between process with multiprocessing is to use a Queue to pass messages back and forth. Any pickle-able object can pass through a Queue. This short example only passes a single message to a single worker, then the main process waits for the worker to finish.
Multiprocessing Manager provides a way of creating centralized Python objects that can be shared safely among processes. Managers provide a way to create data which can be shared between different processes, including sharing over a network between processes running on different machines.
In Python, a process is an instance of the Python interpreter that executes Python code. In Python, the first process created when we run our program is called the 'MainProcess'. It is also a parent process and may be called the main parent process. The main process will create the first child process or processes.
Delegate the retrieval to a separate process which queues the requests until it is their turn.
I think that you'll find that the multiprocessing
module will provide you with some fairly familiar constructs.
You might find that multiprocessing.Queue
is useful for connecting your worker threads back to a managing thread that could provide monitoring or throttling.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With