I've done some basic performance and memory consumption benchmarks and I was wondering if there is any way to make things even faster...
I have a giant list of 70,000 elements; each element is a tuple containing a numpy ndarray and its file path.
My first version passed a sliced-up copy of the list to each of the processes created with Python's multiprocessing module, but RAM usage exploded to over 20 GB.
In the second version I moved the list into global scope and access it by index (e.g. foo[i]) in a loop in each of my processes. That seems to put it into a shared-memory area with copy-on-write semantics across the processes, so memory usage no longer explodes (it stays at ~3 GB).
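For reference, a minimal sketch of that second pattern (the work function, the sum() call, and the pool size are placeholders of mine, not the real workload; inheriting the list this way only gives copy-on-write sharing on fork-based platforms):

import multiprocessing

sim = []  # populated once in the parent, before the pool is created

def work(i):
    # Only the index is sent to the child; the (ndarray, path) tuple is read
    # from the global list inherited via fork/copy-on-write.
    np_array, path = sim[i]
    return np_array.sum(), path

if __name__ == "__main__":
    # ... fill sim here ...
    with multiprocessing.Pool(processes=8) as pool:
        results = pool.map(work, range(len(sim)))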
However, according to the performance benchmarks/tracing, the large majority of the application's time now seems to be spent in "acquire"...
So I was wondering if there is any way I can somehow turn this list into some sort of lock-free/read-only structure, so that I can do away with part of the acquire step and speed up access even more.
Edit 1: Here are the top few lines of the profiler output for the app:
ncalls tottime percall cumtime percall filename:lineno(function)
65 2450.903 37.706 2450.903 37.706 {built-in method acquire}
39320 0.481 0.000 0.481 0.000 {method 'read' of 'file' objects}
600 0.298 0.000 0.298 0.000 {posix.waitpid}
48 0.271 0.006 0.271 0.006 {posix.fork}
Edit 2: Here's an example of the list structure:
# Sample code for a rough idea of how the list is constructed
import os
import numpy as np
from PIL import Image

sim = []
for root, dirs, files in os.walk(rootdir):
    for filename in files:
        path = os.path.join(root, filename)
        image = Image.open(path)
        np_array = np.asarray(image)
        sim.append((np_array, path))
# Roughly it would look something like this
sim = [ (np.array([[1, 2, 3], [4, 5, 6]], np.int32), "/foobar/com/what.something") ]
From then on, the sim list is to be read-only.
The multiprocessing module provides exactly what you need: a shared array with optional locking, namely the multiprocessing.Array class. Pass lock=False to the constructor to disable locking.
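A minimal sketch of what that looks like (the typecode "i" and the size are placeholder values, not taken from the question):

import multiprocessing

# With lock=False, multiprocessing.Array returns a raw ctypes array with no
# synchronization wrapper, so element access does not go through acquire/release.
shared = multiprocessing.Array("i", 1000, lock=False)
shared[0] = 42  # fill it in the parent before the worker processes are started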
Edit (taking into account your update): Things are actually considerably more involved than I initially expected. The data of all elements in your list needs to be created in shared memory. Whether you put the list itself (i.e. the pointers to the actual data) in shared memory does not matter too much, because it should be small compared to the data of all files. To store the file data in shared memory, use
shared_data = multiprocessing.sharedctypes.RawArray("c", data)
where data is the data you read from the file. To use this as a NumPy array in one of the processes, use
numpy.frombuffer(shared_data, dtype="c")
which will create a NumPy array view for the shared data. Similarly, to put the path name into shared memory, use
shared_path = multiprocessing.sharedctypes.RawArray("c", path)
where path is an ordinary Python string. In your processes, you can access it as a Python string via shared_path.raw. Now append (shared_data, shared_path) to your list. The list will get copied to the other processes, but the actual data won't.
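Putting those pieces together, a rough sketch under the same assumptions (the helper names make_shared_item/read_shared_item and the encode/decode calls are mine, added for illustration):

import multiprocessing.sharedctypes
import numpy as np

def make_shared_item(path):
    # In the parent process, before forking: copy the file's bytes and its
    # path into shared memory once.
    with open(path, "rb") as f:
        data = f.read()
    shared_data = multiprocessing.sharedctypes.RawArray("c", data)
    shared_path = multiprocessing.sharedctypes.RawArray("c", path.encode())
    return (shared_data, shared_path)

def read_shared_item(item):
    # In a worker process: build zero-copy views onto the shared buffers.
    shared_data, shared_path = item
    np_view = np.frombuffer(shared_data, dtype="c")  # NumPy view of the bytes, no copy
    path = shared_path.raw.decode()                  # back to an ordinary string
    return np_view, path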
I initially meant to use a multiprocessing.Array for the actual list. That would be perfectly possible and would ensure that the list itself (i.e. the pointers to the data) also lives in shared memory. But now I think this is not that important, as long as the actual data is shared.