
Can't explain poor bandwidth performance using boost asio TCP sockets

The following is a trivial TCP server and matching client I wrote to start practicing with Boost's Asio library: Example TCP Client/Server.

  • The client simply connects and sends data as fast as it can from a memory buffer.
  • The server just listens for messages and prints how many bytes it received from each complete message.

That's it - nothing more. Both examples run on a few threads, mostly default settings are used, and there are no haphazardly placed sleeps that might be throwing things off. They're fairly straightforward to follow; there's practically nothing in them other than direct calls to Boost, with the goal of isolating the problem.

The problem is that the client's output is the following:

Mbytes/sec: 51.648908, Gbytes/sec: 0.051649, Mbits/sec: 413.191267, Gbits/sec: 0.413191

Notes:

  • I'm running my laptop on battery power right now. If I plug it into a power outlet, throughput jumps to ~0.7 Gbits/sec.
  • I've tried message sizes ranging from small 2048-byte messages up to the current 8 MByte messages.
  • I've tried with the Nagle algorithm both enabled and disabled.
  • I've tried resizing the OS send and receive buffers.
  • All of this runs over loopback, 127.0.0.1.
  • Monitoring loopback with Wireshark shows the same low bandwidth usage.

The point that actually prompted me to write this question is the following: the iperf tool is able to achieve 33.0 Gbits/sec using TCP via localhost.

$ iperf --client 127.0.0.1
------------------------------------------------------------
Client connecting to 127.0.0.1, TCP port 5001
TCP window size: 2.50 MByte (default)
------------------------------------------------------------
[  3] local 127.0.0.1 port 41952 connected with 127.0.0.1 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  38.4 GBytes  33.0 Gbits/sec

All searches along the lines of "TCP boost asio performance" tend to turn up suggestions to disable Nagle or to adjust the OS socket buffers.

If anyone could point me in the right direction to understand why I'm obtaining such low bandwidth performance using boost asio I would greatly appreciate it!

asked Oct 14 '25 by JDR
1 Answer

First of all, let me point out the things you are doing wrong for this kind of test.

  1. Manually setting the TCP buffer size

This is better left to the TCP stack to work out. The best size is normally discovered during the TCP slow-start phase, where the algorithm settles on the best possible window size based on observed congestion. Since we are on localhost, over a point-to-point connection with no network components in between, congestion will be near zero.

  2. Toggling the Nagle algorithm

This is not actually needed here, since you are not sending many short framed packets. That is the case where disabling Nagle can give you some benefit on latency (for throughput I am not so sure there would be any improvement).

  3. Unwanted processing on the server side

I see that you are iterating over the received buffer and doing a meaningless check on every byte. iperf certainly does not do that, so I have commented out that section of the code.

  4. Application receive buffer size

I don't know why, but you have chosen to read only 2048 bytes per receive. Is there a particular reason for that? I have changed it back to the actual size being written by the client; you were probably just queueing up data in the server's receive path.

New Server code:

#include <thread>
#include <chrono>
#include <cstdio>
#include <vector>
#include <signal.h>
#include <asio.hpp>
#include <system_error>

namespace
{
// sig_atomic_t is safe to write from a signal handler
volatile sig_atomic_t keepGoing = 1;
void shutdown(int)
{
        keepGoing = 0;
}

std::size_t bytesAccum = 0;
void justReceive(std::error_code ec, std::size_t bytesReceived,
    asio::ip::tcp::socket &socket, std::vector<unsigned char> &buffer)
{
        bytesAccum += bytesReceived;
/*
        auto end = buffer.begin() + bytesReceived;
        for (auto it = buffer.begin(); it != end; ++it)
        {
                if (*it == 'e')
                {
                        std::printf("server got: %lu\n", bytesAccum);
                        bytesAccum = 0;
                }
        }
*/
        socket.async_receive(
            asio::buffer(buffer),
            0,
            [&] (auto ec, auto bytes) {
              justReceive(ec, bytes, socket, buffer);
            });
}
}

int main(int, char **)
{
        signal(SIGINT, shutdown);

        asio::io_service io;
        asio::io_service::work work(io);

        std::thread t1([&]() { io.run(); });
        std::thread t2([&]() { io.run(); });
        std::thread t3([&]() { io.run(); });
        std::thread t4([&]() { io.run(); });

        asio::ip::tcp::acceptor acceptor(io,
            asio::ip::tcp::endpoint(
                asio::ip::address::from_string("127.0.0.1"), 1234));
        asio::ip::tcp::socket socket(io);

        // accept 1 client
        std::vector<unsigned char> buffer(131072, 0);
        acceptor.async_accept(socket, [&socket, &buffer](std::error_code ec)
        {
            // options
            //socket.set_option(asio::ip::tcp::no_delay(true)); 
            //socket.set_option(asio::socket_base::receive_buffer_size(8192  * 2));
            //socket.set_option(asio::socket_base::send_buffer_size(8192));

            socket.async_receive(
                asio::buffer(buffer),
                0,
                [&](auto ec, auto bytes) {
                  justReceive(ec, bytes, socket, buffer);
                });
        });

        while (keepGoing)
        {
                std::this_thread::sleep_for(std::chrono::seconds(1));
        }

        io.stop();

        t1.join();
        t2.join();
        t3.join();
        t4.join();

        std::printf("server: goodbye\n");
}

New client code:

#include <thread>
#include <chrono>
#include <cstdio>
#include <vector>
#include <signal.h>
#include <asio.hpp>
#include <system_error>

namespace
{
// sig_atomic_t is safe to write from a signal handler
volatile sig_atomic_t keepGoing = 1;
void shutdown(int) { keepGoing = 0; }
}

int main(int, char **)
{
        signal(SIGINT, shutdown);

        asio::io_service io;
        asio::io_service::work work(io);

        std::thread t1([&]() { io.run(); });
        std::thread t2([&]() { io.run(); });
        std::thread t3([&]() { io.run(); });
        std::thread t4([&]() { io.run(); });

        asio::ip::tcp::socket socket(io);
        auto endpoint = asio::ip::tcp::resolver(io).resolve({ 
            "127.0.0.1", "1234" });
        asio::connect(socket, endpoint);

        // options to test
        //socket.set_option(asio::ip::tcp::no_delay(true)); 
        //socket.set_option(asio::socket_base::receive_buffer_size(8192));
        //socket.set_option(asio::socket_base::send_buffer_size(8192 * 2));

        std::vector<unsigned char> buffer(131072, 0);
        buffer.back() = 'e';

        std::chrono::time_point<std::chrono::system_clock> last = 
            std::chrono::system_clock::now();

        std::chrono::duration<double> delta = std::chrono::seconds(0);

        std::size_t bytesSent = 0;

        while (keepGoing)
        {
                // blocks during send
                asio::write(socket, asio::buffer(buffer));
                //socket.send(asio::buffer(buffer));

                // accumulate bytes sent
                bytesSent += buffer.size();

                // accumulate time spent sending
                delta += std::chrono::system_clock::now() - last;
                last = std::chrono::system_clock::now();

                // print information periodically
                if (delta.count() >= 5.0) 
                {
                        std::printf("Mbytes/sec: %f, Gbytes/sec: %f, Mbits/sec: %f, Gbits/sec: %f\n",
                                    bytesSent / 1.0e6 / delta.count(),
                                    bytesSent / 1.0e9 / delta.count(),
                                    8 * bytesSent / 1.0e6 / delta.count(),
                                    8 * bytesSent / 1.0e9 / delta.count());

                        // reset accumulators
                        bytesSent = 0;
                        delta = std::chrono::seconds(0);
                }
        }

        io.stop();

        t1.join();
        t2.join();
        t3.join();
        t4.join();

        std::printf("client: goodbye\n");
}

NOTE: I used the standalone version of Asio, but the results reported by the OP were reproducible on my machine, which is:

MacBook Pro (Yosemite) - 2.6 GHz Intel Core i5 - 8 GB DDR3 RAM.

answered Oct 17 '25 by Arunmu

