
Why does (Py)ZeroMQ open so many Unix socket files?

I tried to monitor the number of open Unix socket files with lsof -U | wc -l while I executed this code:

>>> import zmq
# 1375 Unix socket files
>>> c = zmq.Context()
# 1377 Unix socket files
>>> s = c.socket(zmq.PUSH)
# 1383 Unix socket files
>>> s.close()
# 1381 Unix socket files
>>> c.destroy()
# 1375 Unix socket files

Why is that? I would expect a TCP or IPC socket file to be opened when I connect the socket, but why are these files opened before any connection is made?

It seems they are all of type "STREAM":

[screenshot not available]

Update

See @gdlmx's answer for a useful script to reproduce this issue.

It seems if you use Conda to install pyzmq everything works as expected. I, however, am still interested in knowing why it would not work if you install pyzmq with pip, which I would consider the standard way to install the package.

Steps to reproduce:

With Conda:

conda create -n foo python=3.6
conda activate foo
pip install pyzmq
python test_script.py

With Python's venv:

python3.6 -m venv venv
source ./venv/bin/activate
pip install pyzmq
python test_script.py
asked Apr 04 '19 10:04 by Peque



1 Answer

I recommend rerunning your test with plain python or ipython (without console). Please also limit the count to a single process with lsof -p <pid>, so that other processes on your machine (the source of those 1375 Unix socket files in your test) do not interfere.

Here is a simple test script:

import os
pid = os.getpid()
count=0

def lsof():
    global count
    count += 1
    print(count,':')
    os.system("lsof -p {0:d} 2>/dev/null | grep -E 'unix|IPv4|IPv6'".format(pid)) # -U doesn't work together with the -p option
    # Alternatively, you can use "lsof -U 2>/dev/null | grep -E {0:d}",
    # but then only Unix socket files will be listed.

import zmq
c = zmq.Context();lsof()
tcp = c.socket(zmq.PUSH);lsof()
unix = c.socket(zmq.PUSH);lsof()

print('--- To bind  ---')
tcp.bind('tcp://127.0.0.1:19413');lsof()
unix.bind('ipc://filename');lsof()

print('--- To close ---')
tcp.close();lsof()
unix.close();lsof()

Below is the test result in my environment (python 3.6.6, pyzmq 17.1.2, w/ Anaconda in CentOS 7).

1 :
2 :
3 :
--- To bind  ---
4 :
ZMQbg/1 284018 gdlmx   13u     IPv4           49443178      0t0      TCP localhost:19413 (LISTEN)
5 :
ZMQbg/1 284018 gdlmx   13u     IPv4           49443178      0t0      TCP localhost:19413 (LISTEN)
ZMQbg/1 284018 gdlmx   14u     unix 0xffff9cd6c5bf4800      0t0 49443204 filename
--- To close ---
6 :
ZMQbg/1 284018 gdlmx   14u     unix 0xffff9cd6c5bf4800      0t0 49443204 filename
7 :

I've used python and ipython to run the script and got the same result.

To conclude, the socket file or network port is opened only when socket.bind is called. No other sockets were opened by the python/ipython processes during my tests.
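The per-process counting can also be done without lsof. The following is a minimal sketch, assuming Linux (it reads /proc/self/fd) and Python 3.7 or later (where socket.socket(fileno=...) detects the address family automatically). The helper name count_unix_sockets is made up for this example and is not part of pyzmq.

```python
import os
import socket
import stat


def count_unix_sockets():
    """Count AF_UNIX sockets held by the current process (Linux, Python 3.7+)."""
    count = 0
    for name in os.listdir("/proc/self/fd"):
        fd = int(name)
        try:
            if not stat.S_ISSOCK(os.fstat(fd).st_mode):
                continue
            # Work on a duplicate so that closing the probe leaves fd untouched.
            dup = os.dup(fd)
        except OSError:
            continue  # fd disappeared between listing and inspection
        try:
            probe = socket.socket(fileno=dup)
        except OSError:
            os.close(dup)
            continue
        with probe:  # closes only the duplicate
            if probe.family == socket.AF_UNIX:
                count += 1
    return count


if __name__ == "__main__":
    before = count_unix_sockets()
    a, b = socket.socketpair()  # a connected pair of Unix STREAM sockets
    print("after socketpair():", count_unix_sockets() - before)
    a.close()
    b.close()
    print("after close():", count_unix_sockets() - before)
```

Calling count_unix_sockets() before and after each ZeroMQ call gives the same kind of differences as the lsof-based script, restricted to the current process and to Unix sockets.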

Update

In response to the OP's update:

The abnormal (unexpected) behavior is probably caused by the pre-built binaries bundled in the pyzmq package on PyPI. pip install pyzmq downloads a binary distribution (wheel) from PyPI, which contains the following pre-compiled binary files:

zmq/backend/cython:
    _device.so  _proxy_steerable.so  constants.so  error.so    socket.so
    _poll.so    _version.so          context.so    message.so  utils.so

zmq/.libs:
    libzmq-39117701.so.5.2.1         libsodium-72341b7d.so.23.2.0
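To check whether a local installation carries such bundled libraries, a small sketch like the following can list them. It assumes the wheel layout shown above (a .libs directory inside the zmq package); other pyzmq versions may place the libraries elsewhere. The function name bundled_shared_libs is made up for this example.

```python
import importlib.util
import os


def bundled_shared_libs(package="zmq", subdir=".libs"):
    """Return the file names found in <package>/<subdir>, or [] if absent.

    The ".libs" location is an assumption taken from the listing above.
    """
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return []  # package not installed, or not a package directory
    found = []
    for location in spec.submodule_search_locations:
        libs_dir = os.path.join(location, subdir)
        if os.path.isdir(libs_dir):
            found.extend(sorted(os.listdir(libs_dir)))
    return found


if __name__ == "__main__":
    libs = bundled_shared_libs()
    if libs:
        print("bundled libraries:", ", ".join(libs))
    else:
        print("no bundled libraries found in the assumed location")
```

An empty result only means nothing was found in the assumed directory; it does not prove that the installation links against a system-wide libzmq.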

To be compatible with as many Linux distributions as possible, these binaries are built on a very old OS (CentOS 5) in a Docker environment called manylinux.

Anaconda uses a different approach: it pre-builds the binaries itself and ships all dependencies in the conda/envs folder, so its binaries are built in a relatively up-to-date environment.

I tested the PyPI binaries on my CentOS 7 machine with the above script. I can confirm that ZeroMQ opens some "background" sockets (2 sockets after context creation and 8 after the first socket creation). Although my tests below suggest that they are used for inter-thread communication in ZeroMQ's internal mechanisms, it's better to ask the maintainers of the PyPI package directly.

You may also try to force pip/setuptools to build ZeroMQ for your OS:

sudo yum install libzmq3-devel #  RHEL-based
pip install --no-use-wheel pyzmq 
# Use `--no-binary :all:` instead of `--no-use-wheel` in pip >= 10.0.0

This might get rid of the background sockets, if that's what you want.

What's the purpose of the background sockets?

ZeroMQ internally uses multiple threads for I/O operations. The number of threads can be configured via IO_THREADS. I found that this number affects the number of sockets in use. Test it with:

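import sys
import zmq

# lsof() is the helper defined in the test script above
num_io_threads = int(sys.argv[1])
c = zmq.Context()
c.set(zmq.IO_THREADS, num_io_threads)  # must be set before the first socket is created
s = c.socket(zmq.PUSH)
lsof()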

You will find that number_of_sockets = 6 + 2 * num_io_threads. Thus, I postulate that the ZeroMQ binaries from PyPI internally use sockets for inter-thread communication between the main thread and the worker/IO threads.
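To make the postulate concrete, here is an illustrative sketch of how a thread could be woken through a connected pair of Unix STREAM sockets, so that each such channel costs two socket files. This is not ZeroMQ's code: the Mailbox class and both function names are hypothetical, and whether the PyPI build works this way is an assumption. The formula is simply the one observed above.

```python
import socket
import threading


class Mailbox:
    """Hypothetical wake-up channel backed by one pair of Unix sockets."""

    def __init__(self):
        # socketpair() creates two connected AF_UNIX / SOCK_STREAM sockets.
        self.reader, self.writer = socket.socketpair()

    def signal(self):
        self.writer.send(b"\0")  # one byte is enough to wake the reader

    def wait(self):
        self.reader.recv(1)  # blocks until signal() is called

    def close(self):
        self.reader.close()
        self.writer.close()


def expected_socket_count(num_io_threads):
    """Socket count observed in the answer after creating one PUSH socket."""
    return 6 + 2 * num_io_threads


def wake_worker():
    """Start a worker blocked on its mailbox, wake it, report whether it woke."""
    box = Mailbox()
    woke = threading.Event()

    def worker():
        box.wait()
        woke.set()

    thread = threading.Thread(target=worker)
    thread.start()
    box.signal()
    thread.join(timeout=5)
    box.close()
    return woke.is_set()


if __name__ == "__main__":
    for n in (1, 2, 4):
        print(n, "IO thread(s):", expected_socket_count(n), "sockets expected")
    print("worker woken:", wake_worker())
```

Under this assumption, the 2 * num_io_threads term would correspond to one such pair per IO thread, and the constant 6 to three more pairs that do not depend on the number of IO threads.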

answered Oct 14 '22 03:10 by gdlmx