For my python application I am thinking of using shelve, part of the standard library. There will be hundreds of processes, each writing something to the same shelve object. The writing will always be to add a new key,value pair to the shelve. The keys are unique, so no two processes will update the same entry.
What could go wrong in such a scenario?
The shelve documentation is explicit about this.
The shelve module does not support concurrent read/write access to shelved objects. (Multiple simultaneous read accesses are safe.) When a program has a shelf open for writing, no other program should have it open for reading or writing. Unix file locking can be used to solve this, but this differs across Unix versions and requires knowledge about the database implementation used.
So, without process synchronisation, I wouldn't do it.
How are the processes started? If they are created by a master process then you can look at the multiprocessing module. Use a Queue to which the child processes write back their results, and have the master remove items from the queue and write them to the shelf. Example of this sort of this is at https://stackoverflow.com/a/24501437/21945.
If you have no process hierarchy then you'll need to use locking to control read and write access to the shelf file. If you are using Linux or similar you might use posix_ipc named semaphore.
The other obvious option is to use a database server - Postgresql or similar.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With