I have a simple TCP echo server that uses only the standard library:
use std::net::TcpListener;

fn main() {
    let listener = TcpListener::bind("localhost:4321").unwrap();
    loop {
        let (conn, _addr) = listener.accept().unwrap();
        std::io::copy(&mut &conn, &mut &conn).unwrap();
    }
}
It uses about 11 MB of memory:
If I convert it to use tokio:
tokio = { version = "0.2.22", features = ["full"] }
use tokio::net::TcpListener;

#[tokio::main]
async fn main() {
    let mut listener = TcpListener::bind("localhost:4321").await.unwrap();
    loop {
        let (mut conn, _addr) = listener.accept().await.unwrap();
        let (read, write) = &mut conn.split();
        tokio::io::copy(read, write).await.unwrap();
    }
}
It uses 607 MB of memory:
Similarly, with async_std:
async-std = "1.6.2"
use async_std::net::TcpListener;

fn main() {
    async_std::task::block_on(async {
        let listener = TcpListener::bind("localhost:4321").await.unwrap();
        loop {
            let (conn, _addr) = listener.accept().await.unwrap();
            async_std::io::copy(&mut &conn, &mut &conn).await.unwrap();
        }
    });
}
It also uses 607 MB of memory:
Why do the asynchronous versions of the program use 55x more memory than the synchronous one?
You should look at the RES column (resident memory, i.e. what is actually held in RAM) rather than the virtual size. One uses 1.0 MB, the other uses 1.6 MB.
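If you want to check the two numbers from inside a process instead of reading them off a monitoring tool, here is a minimal sketch. It assumes Linux, where /proc/self/status exposes the virtual size as VmSize and the resident set as VmRSS; the helper name parse_kib is my own, not part of any library.

```rust
use std::fs;

/// Extracts a value in KiB from text in the /proc/<pid>/status format,
/// e.g. the line "VmRSS:      1040 kB" yields Some(1040) for key "VmRSS".
/// (Hypothetical helper written for this illustration.)
fn parse_kib(status: &str, key: &str) -> Option<u64> {
    status.lines().find_map(|line| {
        // Require "<key>:" exactly, so "VmRSS" does not match other keys.
        let rest = line.strip_prefix(key)?.strip_prefix(':')?;
        rest.split_whitespace().next()?.parse::<u64>().ok()
    })
}

fn main() {
    // Linux-only assumption: /proc/self/status describes the current process.
    match fs::read_to_string("/proc/self/status") {
        Ok(status) => {
            // VmSize is the virtual size, VmRSS the resident set (RES).
            println!("VmSize: {:?} KiB", parse_kib(&status, "VmSize"));
            println!("VmRSS:  {:?} KiB", parse_kib(&status, "VmRSS"));
        }
        Err(err) => eprintln!("could not read /proc/self/status: {}", err),
    }
}
```

A large gap between the two values means most of the address space is reserved but not resident.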
Most of it is likely a constant overhead needed to start the tokio runtime and its thread pool.
I tried it here, and like you said in the comments, there are several 64MB blocks:
==> pmap -d $(pidof tokio)
3605: target/release/tokio
Address Kbytes Mode Offset Device Mapping
…
0000555b2a634000 132 rw--- 0000000000000000 000:00000 [ anon ]
00007f2fec000000 132 rw--- 0000000000000000 000:00000 [ anon ]
00007f2fec021000 65404 ----- 0000000000000000 000:00000 [ anon ]
00007f2ff0000000 132 rw--- 0000000000000000 000:00000 [ anon ]
00007f2ff0021000 65404 ----- 0000000000000000 000:00000 [ anon ]
00007f2ff4000000 132 rw--- 0000000000000000 000:00000 [ anon ]
00007f2ff4021000 65404 ----- 0000000000000000 000:00000 [ anon ]
…
Those blocks are neither readable nor writable, so they aren't backed by any physical memory and don't use any RAM. They simply represent reserved address space.
Moreover, as you can see, each of those 65404K blocks comes immediately after a 132K block. Since 65404 + 132 is exactly 65536 (that is, 64 MiB), I suspect that these blocks represent address space that is reserved in case the runtime needs to grow one of those 132K blocks later on. It might be interesting to see how things look after a couple of hours and a few thousand connections.
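To double-check the arithmetic, here is a small sketch that parses the pmap rows quoted above and adds up every writable mapping that is directly followed in memory by an inaccessible one. The names Mapping, parse_pmap and reserved_pairs are made up for this illustration, and the parser assumes the column layout shown above.

```rust
/// The pmap rows quoted in this answer (elided rows left out).
const SAMPLE: &str = "\
0000555b2a634000 132 rw--- 0000000000000000 000:00000 [ anon ]
00007f2fec000000 132 rw--- 0000000000000000 000:00000 [ anon ]
00007f2fec021000 65404 ----- 0000000000000000 000:00000 [ anon ]
00007f2ff0000000 132 rw--- 0000000000000000 000:00000 [ anon ]
00007f2ff0021000 65404 ----- 0000000000000000 000:00000 [ anon ]
00007f2ff4000000 132 rw--- 0000000000000000 000:00000 [ anon ]
00007f2ff4021000 65404 ----- 0000000000000000 000:00000 [ anon ]";

/// One row of `pmap -d` output: start address, size in KiB, mode string.
#[derive(Debug)]
struct Mapping {
    addr: u64,
    kib: u64,
    mode: String,
}

/// Parses rows of the form "<hex address> <KiB> <mode> ...".
/// Rows that do not start with a hex address and a size are skipped.
fn parse_pmap(text: &str) -> Vec<Mapping> {
    text.lines()
        .filter_map(|line| {
            let mut fields = line.split_whitespace();
            let addr = u64::from_str_radix(fields.next()?, 16).ok()?;
            let kib = fields.next()?.parse::<u64>().ok()?;
            let mode = fields.next()?.to_string();
            Some(Mapping { addr, kib, mode })
        })
        .collect()
}

/// Combined size in KiB of each writable mapping that is immediately
/// followed (address-wise) by an inaccessible "-----" mapping.
fn reserved_pairs(maps: &[Mapping]) -> Vec<u64> {
    maps.windows(2)
        .filter(|w| {
            w[0].mode.starts_with("rw")
                && w[1].mode == "-----"
                && w[0].addr + w[0].kib * 1024 == w[1].addr
        })
        .map(|w| w[0].kib + w[1].kib)
        .collect()
}

fn main() {
    for total in reserved_pairs(&parse_pmap(SAMPLE)) {
        println!("{} KiB = {} MiB", total, total / 1024);
    }
}
```

For the rows above this finds three adjacent pairs, each adding up to 65536 KiB, which matches the 64 MB blocks mentioned in the comments.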
The malloc implementation of glibc reserves a new block of address space for each thread. The size of the block is specified by the compile-time constant HEAP_MAX_SIZE (Source). Because the tokio runtime spawns multiple threads, this results in the high virtual memory usage.
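If you want to observe this without tokio, the following sketch spawns plain std threads that each make one small allocation and reports the virtual size while they are alive. It assumes Linux (it reads VmSize from /proc/self/status); the function names are my own. The growth you see also includes the thread stacks, and the exact figures depend on the libc and its settings, so I am deliberately not quoting an expected output.

```rust
use std::fs;
use std::sync::{Arc, Barrier};
use std::thread;

/// Reads this process's virtual size (the VmSize line) in KiB.
/// Returns None where /proc/self/status is unavailable (non-Linux).
fn vm_size_kib() -> Option<u64> {
    let status = fs::read_to_string("/proc/self/status").ok()?;
    status.lines().find_map(|line| {
        line.strip_prefix("VmSize:")?
            .split_whitespace()
            .next()?
            .parse::<u64>()
            .ok()
    })
}

/// Spawns `n` threads that each make one small heap allocation, then
/// measures the virtual size while all of them are still alive.
fn vm_size_with_threads(n: usize) -> Option<u64> {
    // n workers plus the measuring thread meet at the barrier twice.
    let barrier = Arc::new(Barrier::new(n + 1));
    let handles: Vec<_> = (0..n)
        .map(|_| {
            let barrier = Arc::clone(&barrier);
            thread::spawn(move || {
                let data = vec![1u8; 1024]; // allocated on this thread
                barrier.wait(); // allocation done
                barrier.wait(); // stay alive until measured
                data // returned so the allocation cannot be optimized away
            })
        })
        .collect();
    barrier.wait();
    let size = vm_size_kib();
    barrier.wait();
    for handle in handles {
        let data = handle.join().expect("worker thread panicked");
        assert_eq!(data.len(), 1024);
    }
    size
}

fn main() {
    let before = vm_size_kib();
    let during = vm_size_with_threads(8);
    println!("before: {:?} KiB, with 8 threads: {:?} KiB", before, during);
}
```

Running the same binary built against glibc and against musl should make the difference described here visible in the second number.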
To avoid this, you can compile your Rust program for the musl target with cargo build --target=x86_64-unknown-linux-musl.
After all, this is an optimization in glibc and not an effect of Rust or the tokio runtime.