 

ZeroMQ multithreading: create sockets on-demand or use sockets object pool?

I'm building a POC leveraging the ZeroMQ N-to-N pub/sub model. On our app server, when an HTTP request is serviced and the thread pulls data from the database, it updates a local memcache instance with that data. To synchronize the other memcache instances in the app server cluster, the request thread sends a message with the data using a ZMQ publisher. So the question is: what strategy is the most effective at minimizing socket create/destroy overhead when the application has many threads that depend on sockets for sending messages? Do we share a pool of sockets, do we create and destroy sockets per thread, or something else?

Strategy 1 - Thread-managed Publisher Socket
In this approach, each thread, T1, T2, and T3, manages the lifecycle of a socket object (publisher) by creating it, making the connection, sending a message, and finally closing the socket. Based on this, it's certainly the safest approach, but we have concerns with respect to overhead when sockets are created, connected, and destroyed repeatedly; if the overhead negatively impacts performance, we'd like to avoid it.


Strategy 2 - Publisher Sockets Object Pool
In this approach, the parent process (app server) initializes a pool of ZMQ publishers on startup. When a thread needs a publisher, it gets one from the object pool, sends its message, then returns the publisher to the pool; the process of creating, connecting and destroying sockets is eliminated with respect to the thread using the publisher, but access to the pool is synchronized to avoid any two threads using the same publisher object at the same time, and this is where deadlocks and concurrency issues may arise.
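To make the pooled approach concrete, here is a minimal sketch of how the hand-off could look. It is an assumption-laden illustration, not a ZeroMQ API: `FakePublisher` and `PublisherPool` are hypothetical names, and `FakePublisher` only counts messages where a real, already-connected PUB socket would send them. A `BlockingQueue` does the synchronization, so a publisher is held by at most one thread at a time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

class PublisherPoolSketch {

    /** Hypothetical stand-in for a connected ZeroMQ PUB socket; not a real API. */
    static final class FakePublisher {
        final AtomicInteger sent = new AtomicInteger();
        void send(String message) { sent.incrementAndGet(); }
    }

    /** Fixed-size pool; take() blocks while every publisher is checked out. */
    static final class PublisherPool {
        private final BlockingQueue<FakePublisher> idle;
        final List<FakePublisher> all = new ArrayList<>();

        PublisherPool(int size) {
            idle = new ArrayBlockingQueue<>(size);
            for (int i = 0; i < size; i++) {
                FakePublisher p = new FakePublisher();
                all.add(p);
                idle.add(p);
            }
        }

        void publish(String message) throws InterruptedException {
            FakePublisher p = idle.take();   // exclusive ownership until returned
            try {
                p.send(message);
            } finally {
                idle.put(p);                 // always return it, even if send() throws
            }
        }
    }

    /** Starts `threads` threads that each publish once; returns total messages sent. */
    static int runDemo(int poolSize, int threads) {
        PublisherPool pool = new PublisherPool(poolSize);
        CountDownLatch done = new CountDownLatch(threads);
        for (int i = 0; i < threads; i++) {
            final String msg = "update-" + i;
            new Thread(() -> {
                try {
                    pool.publish(msg);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.countDown();
                }
            }).start();
        }
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        int total = 0;
        for (FakePublisher p : pool.all) total += p.sent.get();
        return total;
    }

    public static void main(String[] args) {
        System.out.println("messages sent: " + runDemo(4, 100));
    }
}
```

Because a thread only ever blocks on the queue and never holds one publisher while waiting for another, this particular shape should not deadlock. Whether the waiting on `take()` costs more than creating a socket per request is something only profiling would show.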

We have not profiled either approach because we wanted to do a litmus test on SO first. With respect to volume, our application is not publish-"heavy", but there could be between 100 and 150 threads (per app server) needing to publish a message at the same time.


So, to reiterate: What strategy is the most effective with respect to minimizing overhead while emphasizing performance when the application has many threads that depend on publishers for sending messages?

raffian asked May 20 '13 22:05




2 Answers

You can't really ask a question about performance without providing real figures for your estimated throughput. Are we talking about 10 requests per second, 100, 1,000, 10K?

If the HTTP server is really creating and destroying threads for each request, then creating 0MQ sockets repeatedly will stress the OS. Depending on the volume of requests and your process limits, it'll either work or run out of handles. You can test this trivially, and that's a first step.

Then, sharing a pool of sockets (what you mean by "ZMQ publisher") is nasty. People do this, but sockets are not thread-safe, so it means being very careful when you switch a socket to another thread.

If there is a way to keep the threads persistent then each one can create its PUB socket if it needs to, and hold onto it as long as it exists. If not, then my first design would create/destroy sockets anyhow, but use inproc:// to send messages to a single permanent forwarder thread (a SUB-PUB proxy). I'd test this and then if it breaks, go for more exotic designs.

In general it's better to make the simplest design and break it, than to over-think the design process (especially when starting out).

Pieter Hintjens answered Oct 20 '22 17:10


It sounds like premature optimization to me too, and if at all possible, you should stick with the first strategy and save yourself the headaches.

But as an alternative to your second option, you could perhaps maintain an Executor thread pool inside your application to do the actual zmq sending. This way each executor thread can keep its own socket. You can listen to application/servlet life cycle events to know when to shut down the pool and clean up the sockets.

EDIT:

The simplest way to do this is to create the Executor using Executors.newFixedThreadPool() and feed it Runnable jobs that use a ThreadLocal socket. (See "Java Executors and per-thread (not per-work unit) objects?") The threads will be created only once and reused from then on until the Executor is shut down.
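As a rough sketch of that idea, assuming a hypothetical `FakeSocket` class that stands in for a real PUB socket (it only counts calls, it does not talk to ZeroMQ), the executor-plus-ThreadLocal arrangement might look like this:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ThreadLocalSenderSketch {

    /** Hypothetical stand-in for a ZeroMQ PUB socket; it only counts calls. */
    static final class FakeSocket {
        private final AtomicInteger sent;
        FakeSocket(AtomicInteger created, AtomicInteger sent) {
            created.incrementAndGet();   // one increment per socket instance
            this.sent = sent;
        }
        void send(String msg) { sent.incrementAndGet(); }
    }

    /** Returns {socketsCreated, messagesSent} after running `jobs` send jobs. */
    static int[] runDemo(int threads, int jobs) {
        AtomicInteger created = new AtomicInteger();
        AtomicInteger sent = new AtomicInteger();

        // Each executor thread lazily creates its own socket on first use
        // and keeps it for as long as the thread lives.
        ThreadLocal<FakeSocket> socket =
                ThreadLocal.withInitial(() -> new FakeSocket(created, sent));

        ExecutorService senders = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < jobs; i++) {
            final String msg = "update-" + i;
            senders.execute(() -> socket.get().send(msg));
        }

        senders.shutdown();   // stop accepting jobs, let queued ones finish
        try {
            if (!senders.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("senders did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new int[] { created.get(), sent.get() };
    }

    public static void main(String[] args) {
        int[] r = runDemo(4, 50);
        System.out.println("sockets created: " + r[0] + ", messages sent: " + r[1]);
    }
}
```

The number of sockets is bounded by the pool size rather than by the number of HTTP request threads, which is the point of moving the sending into the executor. Closing the sockets on shutdown is left out of this sketch.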

This gets a little tricky when an exception is thrown in the job's run() method. I suspect you'll find you need a little bit more control over the executor threads' lifecycle. If so, you can copy the source for newFixedThreadPool:

return new ThreadPoolExecutor(nThreads, nThreads,
                              0L, TimeUnit.MILLISECONDS,
                              new LinkedBlockingQueue<Runnable>());

and subclass the ThreadPoolExecutor that gets instantiated to customize it. This way you could for example override afterExecute to detect and clean up broken sockets.
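A sketch of such a subclass, under stated assumptions: `FakeSocket` is a hypothetical stand-in for a real socket, and its `send` fails on an empty message purely to simulate a broken socket. Jobs are handed over with `execute()` rather than `submit()`, because `submit()` captures the exception in the returned Future and `afterExecute` would then receive `null`.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class CleaningSenderPoolSketch {

    /** Fixed-size pool that discards a thread's socket after a failed job. */
    static final class SenderPool extends ThreadPoolExecutor {
        final AtomicInteger sent = new AtomicInteger();
        final AtomicInteger closed = new AtomicInteger();
        private final ThreadLocal<FakeSocket> socket =
                ThreadLocal.withInitial(FakeSocket::new);

        /** Hypothetical stand-in for a PUB socket; empty messages simulate a failure. */
        final class FakeSocket {
            void send(String msg) {
                if (msg.isEmpty()) throw new IllegalStateException("simulated send failure");
                sent.incrementAndGet();
            }
            void close() { closed.incrementAndGet(); }
        }

        SenderPool(int nThreads) {
            // Same arguments as the newFixedThreadPool source quoted above.
            super(nThreads, nThreads, 0L, TimeUnit.MILLISECONDS,
                  new LinkedBlockingQueue<Runnable>());
        }

        void publish(String msg) {
            execute(() -> socket.get().send(msg));
        }

        @Override
        protected void afterExecute(Runnable r, Throwable t) {
            super.afterExecute(r, t);
            // Runs on the worker thread that executed the job, so the
            // ThreadLocal still refers to that thread's socket.
            if (t != null) {
                socket.get().close();
                socket.remove();
            }
        }
    }

    /** Returns {messagesSent, socketsClosed} for two good jobs and one failing job. */
    static int[] runDemo() {
        SenderPool pool = new SenderPool(2);
        pool.publish("a");
        pool.publish("");    // fails; the uncaught exception is printed to stderr
        pool.publish("b");
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("pool did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new int[] { pool.sent.get(), pool.closed.get() };
    }

    public static void main(String[] args) {
        int[] r = runDemo();
        System.out.println("sent: " + r[0] + ", closed after failure: " + r[1]);
    }
}
```

Note that a job that throws out of `execute()` also terminates its worker thread; the executor starts a replacement, which then creates a fresh socket on first use.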

The send jobs get transferred to the worker threads through a blocking queue. I realise that this is not the ZeroMQ way to hand off the messages to the worker threads, which would be inproc messaging. It moves ZeroMQ away from the HTTP worker threads, whose lifecycle is out of your control and in which sockets are therefore hard to maintain, and more towards the edge of the application. You'd simply have to test which of the two is more efficient, and make a judgement call on how rigorously you want your application to adopt the ZeroMQ messaging paradigm for inter-thread communication.

flup answered Oct 20 '22 18:10