
Using the Majordomo broker with asynchronous clients

While reading the ZeroMQ guide, I came across client code which sends 100k requests in a loop, and then receives the replies in a second loop.

#include "../include/mdp.h"
#include <time.h>


int main (int argc, char *argv [])
{
    int verbose = (argc > 1 && streq (argv [1], "-v"));
    mdp_client_t *session = mdp_client_new ("tcp://localhost:5555", verbose);
    int count;

    //  Send all 100,000 requests without waiting for replies
    for (count = 0; count < 100000; count++) {
        zmsg_t *request = zmsg_new ();
        zmsg_pushstr (request, "Hello world");
        mdp_client_send (session, "echo", &request);
    }
    printf ("sent all\n");

    //  Then collect the replies in a second loop
    for (count = 0; count < 100000; count++) {
        zmsg_t *reply = mdp_client_recv (session, NULL, NULL);
        if (reply)
            zmsg_destroy (&reply);
        else
            break;              //  Interrupted by Ctrl-C
        printf ("reply received:%d\n", count);
    }
    printf ("%d replies received\n", count);
    mdp_client_destroy (&session);
    return 0;
}

I have added a counter to count the number of replies that the worker (test_worker.c) sends to the broker, and another counter in mdp_broker.c to count the number of replies the broker sends to a client. Both counters reach 100k, but the client receives only around 37k replies.

If the number of client requests is set to around 40k, the client receives all the replies. Can someone please tell me why messages are lost when the client sends more than 40k asynchronous requests?

I tried setting the HWM to 100k for the broker socket, but the problem persists:

static broker_t *
s_broker_new (int verbose)
{
    broker_t *self = (broker_t *) zmalloc (sizeof (broker_t));
    int hwm = 100000;           //  ZMQ_SNDHWM / ZMQ_RCVHWM take an int in ZeroMQ 3.x+
    //  Initialize broker state
    self->ctx = zctx_new ();
    self->socket = zsocket_new (self->ctx, ZMQ_ROUTER);
    zmq_setsockopt (self->socket, ZMQ_SNDHWM, &hwm, sizeof (hwm));
    zmq_setsockopt (self->socket, ZMQ_RCVHWM, &hwm, sizeof (hwm));
    self->verbose = verbose;
    self->services = zhash_new ();
    self->workers = zhash_new ();
    self->waiting = zlist_new ();
    self->heartbeat_at = zclock_time () + HEARTBEAT_INTERVAL;
    return self;
}

3 Answers

Without setting the HWM and with the default TCP settings, messages were already being lost with just 50k requests.

The following helped to mitigate the message loss at the broker:

  1. Setting the HWM for the ZeroMQ socket.
  2. Increasing the TCP send/receive buffer size (both sketched below).
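
As a rough sketch (not taken from the broker code above; the socket handle and the values are only illustrative), both settings go through zmq_setsockopt, and in ZeroMQ 3.x+ all four options take an int:

//  Illustrative values; "socket" stands for any raw libzmq socket handle
int hwm = 100000;               //  queue up to 100k messages
int bufsize = 1024 * 1024;      //  1 MB kernel TCP send/receive buffers
zmq_setsockopt (socket, ZMQ_SNDHWM, &hwm, sizeof (hwm));
zmq_setsockopt (socket, ZMQ_RCVHWM, &hwm, sizeof (hwm));
zmq_setsockopt (socket, ZMQ_SNDBUF, &bufsize, sizeof (bufsize));
zmq_setsockopt (socket, ZMQ_RCVBUF, &bufsize, sizeof (bufsize));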

This helped only up to a certain point. With two clients, each sending 100k messages, the broker was able to manage fine. But when the number of clients was increased to three, they stopped receiving all the replies.

Finally, what ensured no message loss for me was changing the design of the client code in the following way (sketched after the list):

  1. A client can send up to N messages at once. The client's RCVHWM and the broker's SNDHWM should be sufficiently high to hold a total of N messages.
  2. After that, for every reply the client receives, it sends two more requests.
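
A rough sketch of that design, reusing the mdp_client_* calls from the question's code (PIPELINE and TOTAL are illustrative values, not from the original post):

#include "../include/mdp.h"

#define PIPELINE  10000         //  N: requests kept in flight at once
#define TOTAL     100000        //  total number of requests to send

int main (void)
{
    mdp_client_t *session = mdp_client_new ("tcp://localhost:5555", 0);
    int sent = 0, received = 0;

    //  Prime the pipeline with N requests
    while (sent < PIPELINE && sent < TOTAL) {
        zmsg_t *request = zmsg_new ();
        zmsg_pushstr (request, "Hello world");
        mdp_client_send (session, "echo", &request);
        sent++;
    }
    //  For every reply received, send up to two more requests
    while (received < TOTAL) {
        zmsg_t *reply = mdp_client_recv (session, NULL, NULL);
        if (!reply)
            break;              //  Interrupted by Ctrl-C
        zmsg_destroy (&reply);
        received++;
        int refill;
        for (refill = 0; refill < 2 && sent < TOTAL; refill++) {
            zmsg_t *request = zmsg_new ();
            zmsg_pushstr (request, "Hello world");
            mdp_client_send (session, "echo", &request);
            sent++;
        }
    }
    printf ("%d replies received\n", received);
    mdp_client_destroy (&session);
    return 0;
}

Sending two requests per received reply keeps the pipeline busy while the client drains replies as they arrive, instead of queuing all 100k requests up-front.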


You send 100k messages and only then begin to receive the replies, so those messages have to be queued somewhere. Once a queue fills up and cannot accept any more messages, ZeroMQ's high-water mark is reached. The behaviour on reaching the high-water mark is specified in the ZeroMQ documentation.

In the case of the above code, the broker may discard some of the messages, since the Majordomo broker uses a ROUTER socket. One possible resolution would be to split the send and receive loops into separate threads.



Why lost?

In ZeroMQ v2.1, the default value for ZMQ_HWM was INF (infinity), which made the said test somewhat meaningful, but at the cost of a heavy risk of memory-exhaustion crashes, since the buffer allocation policy was not constrained or controlled and could grow until it hit some physical limit.

As of ZeroMQ v3.0+, ZMQ_SNDHWM / ZMQ_RCVHWM default to 1000 messages; these defaults can be changed afterwards via zmq_setsockopt().

You may also read an explicit warning in the documentation that

ØMQ does not guarantee that the socket will accept as many as ZMQ_SNDHWM messages, and the actual limit may be as much as 60-70% lower depending on the flow of messages on the socket.

Will splitting the sending / receiving part into separate threads help?

No.

Quick fix?

Yes, for the purpose of demo-test experimenting, set infinite high-water marks again, but be careful to avoid such practice in any production-grade software.
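
A minimal sketch of that quick fix (the socket handle is illustrative); in ZeroMQ 3.x+ a high-water mark of 0 means "no limit":

//  Demo-test only: a zero HWM removes the limit, so memory use is unbounded
int unlimited = 0;
zmq_setsockopt (socket, ZMQ_SNDHWM, &unlimited, sizeof (unlimited));
zmq_setsockopt (socket, ZMQ_RCVHWM, &unlimited, sizeof (unlimited));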

Why test ZeroMQ performance in this way?

As said above, the original demo-test seems to have had some meaning under its v2.1 implementation.

Since those days, ZeroMQ has evolved a lot. A very nice read for your particular interest in performance envelopes, which may help build further insight into this domain, is the step-by-step guide with code examples in the ZeroMQ protocol overheads/performance case study on large file transfers:

... we already run into a problem: if we send too much data to the ROUTER socket, we can easily overflow it. The simple but stupid solution is to put an infinite high-water mark on the socket. It's stupid because we now have no protection against exhausting the server's memory. Yet without an infinite HWM, we risk losing chunks of large files.

Try this: set the HWM to 1,000 (in ZeroMQ v3.x this is the default) and then reduce the chunk size to 100K so we send 10K chunks in one go. Run the test, and you'll see it never finishes. As the zmq_socket() man page says with cheerful brutality, for the ROUTER socket: "ZMQ_HWM option action: Drop".

We have to control the amount of data the server sends up-front. There's no point in it sending more than the network can handle. Let's try sending one chunk at a time. In this version of the protocol, the client will explicitly say, "Give me chunk N", and the server will fetch that specific chunk from disk and send it.

The best part, as far as I know, is the commented progression of the resulting performance towards the "model 3" flow control, and one can learn a lot from the great chapters and real-life remarks in the ZeroMQ Guide.
