
Why do asynchronous versions of a TCP echo server use 50x more memory than a synchronous one?

I have a simple TCP echo server using the standard library:

use std::net::TcpListener;

fn main() {
    let listener = TcpListener::bind("localhost:4321").unwrap();
    loop {
        let (conn, _addr) = listener.accept().unwrap();
        std::io::copy(&mut &conn, &mut &conn).unwrap();
    }
}

It uses about 11 MB of memory:

[Screenshot: memory usage of the standard library version]

Tokio

If I convert it to use tokio:

# Cargo.toml
tokio = { version = "0.2.22", features = ["full"] }

use tokio::net::TcpListener;

#[tokio::main]
async fn main() {
    let mut listener = TcpListener::bind("localhost:4321").await.unwrap();
    loop {
        let (mut conn, _addr) = listener.accept().await.unwrap();
        let (read, write) = &mut conn.split();
        tokio::io::copy(read, write).await.unwrap();
    }
}

It uses 607 MB of memory:

[Screenshot: memory usage of the tokio version]

async_std

Similarly, with async_std:

# Cargo.toml
async-std = "1.6.2"

use async_std::net::TcpListener;

fn main() {
    async_std::task::block_on(async {
        let listener = TcpListener::bind("localhost:4321").await.unwrap();
        loop {
            let (conn, _addr) = listener.accept().await.unwrap();
            async_std::io::copy(&mut &conn, &mut &conn).await.unwrap();
        }
    });
}

It also uses 607 MB of memory:

[Screenshot: memory usage of the async_std version]


Why do the asynchronous versions of the program use 55x more memory than the synchronous one?

Gurwinder Singh asked Aug 04 '20 08:08


3 Answers

You should look at the RES (resident memory) column, not the virtual size. One program uses 1.0 MB, the other 1.6 MB.

Most of the difference is likely a constant overhead needed to start the tokio runtime and its thread pool.
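The same two numbers can be read from inside a process. The following is a minimal sketch, assuming Linux, where `/proc/self/status` reports `VmSize` (what top shows as VIRT) and `VmRSS` (what it shows as RES). The helper name `status_kib` is made up for this illustration.

```rust
use std::fs;

/// Extract a value in KiB from the text of /proc/<pid>/status.
/// `key` is a field name without the colon, e.g. "VmRSS".
/// (Hypothetical helper, written for this example only.)
fn status_kib(status: &str, key: &str) -> Option<u64> {
    status.lines().find_map(|line| {
        // A matching line looks like "VmRSS:\t    1234 kB".
        let rest = line.strip_prefix(key)?.strip_prefix(':')?;
        rest.split_whitespace().next()?.parse::<u64>().ok()
    })
}

fn main() {
    // Linux-only: /proc/self/status describes the current process.
    match fs::read_to_string("/proc/self/status") {
        Ok(status) => {
            println!("VmSize (VIRT): {:?} KiB", status_kib(&status, "VmSize"));
            println!("VmRSS  (RES):  {:?} KiB", status_kib(&status, "VmRSS"));
        }
        Err(e) => eprintln!("could not read /proc/self/status: {e}"),
    }
}
```

Calling this at the start of each echo server variant would print both figures side by side, which makes it easier to see whether a large number is reserved address space or memory that is actually resident.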

Kornel answered Sep 22 '22 08:09


I tried it here and, as you said in the comments, there are several 64 MB blocks:

==> pmap -d $(pidof tokio)
3605:   target/release/tokio
Address           Kbytes Mode  Offset           Device    Mapping
…
0000555b2a634000     132 rw--- 0000000000000000 000:00000   [ anon ]
00007f2fec000000     132 rw--- 0000000000000000 000:00000   [ anon ]
00007f2fec021000   65404 ----- 0000000000000000 000:00000   [ anon ]
00007f2ff0000000     132 rw--- 0000000000000000 000:00000   [ anon ]
00007f2ff0021000   65404 ----- 0000000000000000 000:00000   [ anon ]
00007f2ff4000000     132 rw--- 0000000000000000 000:00000   [ anon ]
00007f2ff4021000   65404 ----- 0000000000000000 000:00000   [ anon ]
…

Those blocks are neither readable nor writable (mode -----), so no physical memory backs them and they don't use any memory. They simply represent reserved address space.

Moreover, as you can see, each of those 65404K blocks comes immediately after a 132K block. Since 65404 + 132 is exactly 65536 (64 MiB), I suspect that these blocks represent address space that is reserved in case the runtime needs to grow one of those 132K blocks later on. It might be interesting to see how things look after a couple of hours and a few thousand connections.
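The arithmetic can be checked mechanically against the pmap rows quoted above. The sketch below parses those rows and tests whether each read-write block plus the inaccessible block after it forms one contiguous, 64 MiB-aligned, 64 MiB-sized region. The type `Mapping` and the functions `parse_row` and `is_one_reservation` are names invented for this example, and the parser assumes the `pmap -d` column order shown in the listing.

```rust
/// One row of `pmap -d` output: start address, size in KiB, permission string.
#[derive(Debug, Clone, PartialEq)]
struct Mapping {
    start: u64,
    kib: u64,
    mode: String,
}

/// Parse a row such as
/// `00007f2fec000000     132 rw--- 0000000000000000 000:00000   [ anon ]`.
fn parse_row(line: &str) -> Option<Mapping> {
    let mut fields = line.split_whitespace();
    let start = u64::from_str_radix(fields.next()?, 16).ok()?;
    let kib = fields.next()?.parse::<u64>().ok()?;
    let mode = fields.next()?.to_string();
    Some(Mapping { start, kib, mode })
}

/// 64 MiB expressed in KiB.
const RESERVATION_KIB: u64 = 64 * 1024;

/// True if `used` (read-write) is immediately followed by `reserved`
/// (no permissions) and together they form one 64 MiB-aligned 64 MiB region.
fn is_one_reservation(used: &Mapping, reserved: &Mapping) -> bool {
    let contiguous = used.start + used.kib * 1024 == reserved.start;
    let aligned = used.start % (RESERVATION_KIB * 1024) == 0;
    let full = used.kib + reserved.kib == RESERVATION_KIB;
    used.mode.starts_with("rw") && reserved.mode == "-----" && contiguous && aligned && full
}

fn main() {
    // Rows copied from the pmap output in this answer.
    let rows = [
        "00007f2fec000000     132 rw--- 0000000000000000 000:00000   [ anon ]",
        "00007f2fec021000   65404 ----- 0000000000000000 000:00000   [ anon ]",
        "00007f2ff0000000     132 rw--- 0000000000000000 000:00000   [ anon ]",
        "00007f2ff0021000   65404 ----- 0000000000000000 000:00000   [ anon ]",
        "00007f2ff4000000     132 rw--- 0000000000000000 000:00000   [ anon ]",
        "00007f2ff4021000   65404 ----- 0000000000000000 000:00000   [ anon ]",
    ];
    let maps: Vec<Mapping> = rows.iter().filter_map(|r| parse_row(r)).collect();
    for pair in maps.chunks(2) {
        if let [used, reserved] = pair {
            let ok = is_one_reservation(used, reserved);
            println!("{:#x}: single 64 MiB reservation = {}", used.start, ok);
        }
    }
}
```

The check passes for the rows shown here because 132 KiB is 0x21000 bytes, which is exactly the offset between each read-write block and the inaccessible block that follows it.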

Jmb answered Sep 24 '22 08:09


The malloc implementation of glibc allocates a new block (an arena) for each thread. The size of the block is specified by the compile-time constant HEAP_MAX_SIZE (defined in the glibc source). Because the tokio runtime spawns multiple threads, this results in the high virtual memory usage.

To avoid this you can compile your Rust program for the musl target with cargo build --target=x86_64-unknown-linux-musl.

After all, this is an optimization in glibc and not an effect of Rust or the tokio runtime.
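The per-thread effect can be observed without any async runtime. The sketch below, which assumes Linux and uses only the standard library, spawns a few threads that each make one small heap allocation and measures the process's virtual size while they are alive. The function names `vm_size_kib` and `vm_size_with_threads` are invented for this illustration. Note that each thread's stack also adds to the virtual size, so the growth is not purely due to malloc, and the exact numbers will vary by system and libc.

```rust
use std::fs;
use std::hint::black_box;
use std::sync::{Arc, Barrier};
use std::thread;

/// Virtual size of the current process in KiB, read from /proc/self/status.
/// Returns None where that file is unavailable (i.e. outside Linux).
fn vm_size_kib() -> Option<u64> {
    let status = fs::read_to_string("/proc/self/status").ok()?;
    status.lines().find_map(|line| {
        line.strip_prefix("VmSize:")?
            .split_whitespace()
            .next()?
            .parse::<u64>()
            .ok()
    })
}

/// Spawn `n` threads that each make one small heap allocation, then report
/// the virtual size measured while all of them are still alive.
fn vm_size_with_threads(n: usize) -> Option<u64> {
    let barrier = Arc::new(Barrier::new(n + 1));
    let handles: Vec<_> = (0..n)
        .map(|_| {
            let barrier = Arc::clone(&barrier);
            thread::spawn(move || {
                // A heap allocation made from this thread.
                let buf = black_box(vec![0u8; 1024]);
                barrier.wait(); // every thread has allocated
                barrier.wait(); // the main thread has measured
                drop(buf);
            })
        })
        .collect();
    barrier.wait();
    let size = vm_size_kib();
    barrier.wait();
    for handle in handles {
        handle.join().unwrap();
    }
    size
}

fn main() {
    println!("before threads: {:?} KiB", vm_size_kib());
    println!("with 8 threads: {:?} KiB", vm_size_with_threads(8));
}
```

Building this once for the default glibc target and once with --target=x86_64-unknown-linux-musl, then comparing the two outputs, is one way to see how much of the virtual size comes from the allocator rather than from the program itself.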

koplas answered Sep 23 '22 08:09