This code works fine on Linux, but fails under Windows (which is expected). I know that the multiprocessing module uses fork()
to spawn a new process and the file descriptors owned by the parent (i.e. the opened socket) are therefore inherited by the child. However, it was my understanding that the only type of data you can send via multiprocessing needs to be pickleable. On Windows and Linux, the socket object is not pickleable.
from socket import socket, AF_INET, SOCK_STREAM
import multiprocessing as mp
import pickle
sock = socket(AF_INET, SOCK_STREAM)
sock.connect(("www.python.org", 80))
sock.sendall(b"GET / HTTP/1.1\r\nHost: www.python.org\r\n\r\n")
try:
pickle.dumps(sock)
except TypeError:
print("sock is not pickleable")
def foo(obj):
print("Received: {}".format(type(obj)))
data, done = [], False
while not done:
tmp = obj.recv(1024)
done = len(tmp) < 1024
data.append(tmp)
data = b"".join(data)
print(data.decode())
proc = mp.Process(target=foo, args=(sock,))
proc.start()
proc.join()
My question is why can a socket
object, a demonstrably non-pickleable object, be passed in with multiprocessing? Does it not use pickle as Windows does?
On unix platforms sockets and other file descriptors can be sent to a different process using unix domain (AF_UNIX) sockets, so sockets can be pickled in the context of multiprocessing.
The multiprocessing module uses a special pickler instance instead of a regular pickler, ForkingPickler, to pickle sockets and file descriptors which then can be unpickled in a different process. It's only possible to do this because it is known where the pickled instance will be unpickled, it wouldn't make sense to pickle a socket or file descriptor and send it between machine boundaries.
For windows there are similar mechanisms for open file handles.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With