I have several large objects (sklearn models) that take up a lot of memory, and I want to share them between several processes. Is there a way to do this?
You can serialize the model with pickle, save the serialized form to a file, and load it again in each process. Note that every process that loads the file gets its own copy of the model in memory. Hope it helps!
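A minimal sketch of that save/load round trip, using only the standard library. The helper names save_model and load_model and the dictionary standing in for a fitted model are assumptions made for illustration; any picklable object, such as a fitted sklearn estimator, should go through the same two calls.

```python
import os
import pickle
import tempfile


def save_model(model, path):
    """Serialize `model` to the file at `path` with pickle."""
    with open(path, "wb") as f:
        pickle.dump(model, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_model(path):
    """Load a pickled object from `path`. Only unpickle files you trust."""
    with open(path, "rb") as f:
        return pickle.load(f)


if __name__ == "__main__":
    # Stand-in for a fitted estimator (assumption: a plain dict is used
    # here so the example has no third-party dependencies).
    model = {"coef": [0.5, -1.2], "intercept": 0.1}
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "model.pkl")
        save_model(model, path)
        restored = load_model(path)
    print(restored == model)
```

Each worker process would call load_model(path) once at startup and keep the result for the lifetime of the process.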
Memory mapping is an alternative approach to file I/O that is available to Python programs through the mmap module. It uses lower-level operating system APIs to map a file's contents directly into the process's address space, so the data can be accessed like an in-memory byte sequence instead of through explicit read calls.
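As a rough illustration, the sketch below maps a file read-only and slices bytes out of it. The helper name read_mapped and the sample file are hypothetical. When several processes map the same file read-only, the operating system can typically back all of the mappings with the same physical pages, which is what makes this approach interesting for large, read-only data.

```python
import mmap
import os
import tempfile


def read_mapped(path, offset, length):
    """Return `length` bytes starting at `offset` from a read-only mapping."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; ACCESS_READ makes the mapping read-only.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return mm[offset:offset + length]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "blob.bin")
        # Hypothetical sample data standing in for a large serialized object.
        with open(path, "wb") as f:
            f.write(b"header:" + bytes(range(16)))
        print(read_mapped(path, 0, 7))
```

For sklearn models specifically, joblib.load accepts an mmap_mode argument (for example mmap_mode="r") that memory-maps the NumPy arrays stored in a file written by joblib.dump; whether that helps depends on how much of the model's size sits in such arrays.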
Scikit-learn (sklearn) is a widely used, robust library for machine learning in Python. It provides a selection of efficient tools for machine learning and statistical modeling, including classification, regression, clustering and dimensionality reduction, via a consistent interface in Python.
Provided that the processes are launched from the same Python script, here is an example that creates a second process and shares variables between the two processes. It is straightforward to extend this to any number of processes. Notice the constructs used to create and access the shared variables and the lock. I have inserted a loop over an arithmetic computation to generate some CPU usage, so that you can monitor how this runs on a multi-core or multi-processor platform. Also note the use of a shared variable to control the second process, in this instance to tell it when to exit. Finally, the shared object can be a value or an array; see https://docs.python.org/2/library/multiprocessing.html
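#!/usr/bin/python
from time import sleep
from multiprocessing import Process, Value, Lock


def myfunc(counter, lock, run):
    # Child process: keep working until the parent clears the shared run flag.
    while run.value:
        sleep(1)
        # Busy work to generate some CPU usage.
        n = 0
        for i in range(10000):
            n = n + i * i
        print(n)
        with lock:
            counter.value += 1
            print("thread %d" % counter.value)
    # Signal the exit through the shared counter.
    with lock:
        counter.value = -1
        print("thread exit %d" % counter.value)


# =======================
# The guard is needed on platforms that start child processes by
# re-importing this module (for example Windows).
if __name__ == "__main__":
    counter = Value('i', 0)   # shared signed integer
    run = Value('b', True)    # shared flag that keeps the child running
    lock = Lock()
    p = Process(target=myfunc, args=(counter, lock, run))
    p.start()

    # Do some work in the parent until the child has counted to 5.
    while counter.value < 5:
        print("main %d" % counter.value)
        n = 0
        for i in range(10000):
            n = n + i * i
        print(n)
        sleep(1)

    # Reset the shared counter and wait for the child to count to 5 again.
    with lock:
        counter.value = 0
    while counter.value < 5:
        print("main %d" % counter.value)
        sleep(1)

    # Tell the child to exit, then wait for it to finish.
    run.value = False
    p.join()
    print("main exit %d" % counter.value)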
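Since the shared object can also be an array, here is a small sketch along the same lines using multiprocessing.Array. The function names fill_squares and run_demo are made up for this illustration; two child processes each fill one half of a shared integer array, and the parent reads the result after joining them.

```python
from multiprocessing import Array, Process


def fill_squares(shared, start, stop):
    """Write i*i into shared[i] for every index in [start, stop)."""
    for i in range(start, stop):
        shared[i] = i * i


def run_demo(size=8):
    """Fill a shared array from two child processes and return its contents."""
    # 'i' means signed int; the array starts zeroed and, by default,
    # is wrapped with a lock that guards each element access.
    shared = Array('i', size)
    mid = size // 2
    workers = [
        Process(target=fill_squares, args=(shared, 0, mid)),
        Process(target=fill_squares, args=(shared, mid, size)),
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return shared[:]


if __name__ == "__main__":
    print(run_demo())
```

The two workers write to disjoint index ranges, so no extra locking beyond the array's default lock is used here. Note that Value and Array only hold simple C-style types, so they suit counters, flags and numeric buffers rather than arbitrary Python objects such as a whole model.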