I tried to create a new object in a process when using multiprocessing module. However, something confuses me.
When I use multiprocessing module, the id of the new object is the same
for i in range(4):
p = multiprocessing.Process(target=worker)
p.start()
def worker():
# stanford named entity tagger
st = StanfordNERTagger(model_path,stanford_ner_path)
print id(st) # all the processes print the same id
But when I use threading, they are different:
for i in range(4):
p = threading.Thread(target=worker)
p.start()
def worker():
# stanford named entity tagger
st = StanfordNERTagger(model_path,stanford_ner_path)
print id(st) # threads print differnt ids
I am wondering why they are different.
Two objects with non-overlapping lifetimes may have the same id() value.
In Python, every object that is created is given a number that uniquely identifies it. It is guaranteed that no two objects will have the same identifier during any period in which their lifetimes overlap.
dummy module module provides a wrapper for the multiprocessing module, except implemented using thread-based concurrency. It provides a drop-in replacement for multiprocessing, allowing a program that uses the multiprocessing API to switch to threads with a single change to import statements.
multiprocessing is a drop in replacement for Python's multiprocessing module. It supports the exact same operations, but extends it, so that all tensors sent through a multiprocessing. Queue , will have their data moved into shared memory and will only send a handle to another process.
id in CPython returns the pointer of the given object. As threads have shared address space, two different instances of an object will be allocated in two different locations returning two different ids (aka virtual address pointers).
This is not the case for separate processes which own their own address space. By chance, they happen to get the same address pointer.
Keep in mind that address pointers are virtual, therefore they represent an offset within the process address space itself. That's why they are the same.
It is usually better not to rely on id() for distinguishing objects, as new ones might get ids of old ones making hard to track them over time. It usually leads to tricky bugs.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With