Why do new objects in multiprocessing have the same id?

I am creating a new object inside each process with the multiprocessing module, and something about the result confuses me.

When I use the multiprocessing module, the id of the new object is the same in every process:

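import multiprocessing
from nltk.tag import StanfordNERTagger

def worker():
    # Stanford named entity tagger; model_path and stanford_ner_path are set elsewhere
    st = StanfordNERTagger(model_path, stanford_ner_path)
    print(id(st))    # all the processes print the same id

if __name__ == '__main__':
    # worker is defined above, before any process is started
    for i in range(4):
        p = multiprocessing.Process(target=worker)
        p.start()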

But when I use threading, they are different:

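import threading
from nltk.tag import StanfordNERTagger

def worker():
    # Stanford named entity tagger; model_path and stanford_ner_path are set elsewhere
    st = StanfordNERTagger(model_path, stanford_ner_path)
    print(id(st))    # threads print different ids

# worker is defined above, before any thread is started
for i in range(4):
    p = threading.Thread(target=worker)
    p.start()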

Why do the ids match across processes but differ across threads?

Minwei Shen asked Nov 12 '15 00:11


1 Answer

In CPython, id() returns the memory address of the given object. Threads share a single address space, so two instances that are alive at the same time must be allocated at different locations and therefore have different ids (virtual addresses).
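The threading case can be checked with a minimal sketch. Tagger is a hypothetical placeholder for an expensive object such as StanfordNERTagger, and the helper name ids_from_threads is likewise made up for illustration. Every instance is kept in a shared list so that all of them are alive when their ids are read, which is the condition under which distinct ids are guaranteed.

```python
import threading


class Tagger(object):
    """Hypothetical stand-in for an expensive object such as StanfordNERTagger."""


def worker(live, lock):
    tagger = Tagger()
    with lock:
        live.append(tagger)  # keep the object alive after the thread exits


def ids_from_threads(n):
    live = []
    lock = threading.Lock()
    threads = [threading.Thread(target=worker, args=(live, lock)) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # All objects are still referenced by `live` here, so their ids must differ.
    return [id(obj) for obj in live]


ids = ids_from_threads(4)
print(len(set(ids)) == len(ids))
```

If the objects were not kept alive, a later thread could legitimately receive the address of an object that an earlier thread had already released.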

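This is not the case for separate processes, each of which owns its own address space. The workers all run the same code from the same starting state (with the fork start method, each child begins as a copy of the parent), so the allocator tends to hand out the same address in each of them. This is likely rather than guaranteed.

Keep in mind that these addresses are virtual: they are only meaningful within the address space of the process that owns them. The same numeric address in two processes refers to different memory, which is why the ids can be equal even though the objects are distinct.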
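Since an address only means something inside its own process, one way to tell the objects apart is to pair the id with the process id. The following is a rough sketch under that assumption; Tagger, identify, and collect are hypothetical names introduced here, not part of multiprocessing or NLTK.

```python
import multiprocessing
import os


class Tagger(object):
    """Hypothetical stand-in for an expensive object such as StanfordNERTagger."""


def identify(obj):
    # An address is only meaningful together with the process that owns it.
    return (os.getpid(), id(obj))


def worker(queue):
    tagger = Tagger()
    queue.put(identify(tagger))


def collect(n):
    queue = multiprocessing.Queue()
    procs = [multiprocessing.Process(target=worker, args=(queue,)) for _ in range(n)]
    for p in procs:
        p.start()
    results = [queue.get() for _ in procs]  # read results before joining
    for p in procs:
        p.join()
    return results


if __name__ == "__main__":
    for pid, addr in collect(4):
        print("pid=%d id=%#x" % (pid, addr))
```

The printed ids may or may not coincide depending on the platform and start method, but the pids always differ, so the (pid, id) pair separates the objects for as long as they are alive.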

It is usually better not to rely on id() to distinguish objects: a new object can receive the id of one that has already been freed, which makes objects hard to track over time and tends to cause tricky bugs.
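As an illustration of that advice, here is a small sketch in which each object carries its own serial number instead of relying on id(). The names Tracked and make_and_discard are made up for this example, and the counter is per process, so in a multiprocessing setting it would need to be combined with something like the pid.

```python
import itertools

_next_serial = itertools.count()


class Tracked(object):
    """Hypothetical object that carries an explicit serial number."""

    def __init__(self):
        self.serial = next(_next_serial)


def make_and_discard():
    obj = Tracked()
    # obj is released when the function returns, so its address can be reused.
    return id(obj), obj.serial


first = make_and_discard()
second = make_and_discard()
print("same id:", first[0] == second[0])        # may well be True
print("same serial:", first[1] == second[1])    # serials never repeat
```

Whether the two ids match depends on the allocator, which is exactly why id() is a poor tracking key; the serial numbers differ regardless.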

noxdafox answered Sep 28 '22 09:09