
Do any boost::asio async calls automatically time out?

I have a client and server using boost::asio asynchronously. I want to add some timeouts to close the connection and potentially retry if something goes wrong.

My initial thought was that any time I call an async_ function I should also start a deadline_timer to expire after I expect the async operation to complete. Now I'm wondering if that is strictly necessary in every case.

For example:

  • async_resolve presumably uses the system's resolver, which has timeouts built into it (e.g. RES_TIMEOUT in resolv.h, possibly overridden by configuration in /etc/resolv.conf). By adding my own timer, I may conflict with how the user wants their resolver to work.

  • For async_connect, the connect(2) syscall has some sort of timeout built into it.

  • etc.

So which (if any) async_ calls are guaranteed to call their handlers within a "reasonable" time frame? And if an operation can (or does) time out, would the handler be passed the basic_errors::timed_out error or something else?

eater asked Feb 07 '11 18:02


2 Answers

So I did some testing. Based on my results, it's clear that the timeout behavior depends on the underlying OS implementation. For reference, I tested this with a stock Fedora kernel: 2.6.35.10-74.fc14.x86_64.

The bottom line is that async_resolve() looks to be the only case where you might be able to get away without setting a deadline_timer. A timer is practically required in every other case for reasonable behavior.


async_resolve()

A call to async_resolve() resulted in 4 queries 5 seconds apart. The handler was called 20 seconds after the request with the error boost::asio::error::host_not_found.

My resolver defaults to a timeout of 5 seconds with 2 attempts (resolv.h), so it appears to send twice the number of queries configured. The behavior is modifiable by setting options timeout and options attempts in /etc/resolv.conf. In every case the number of queries sent was double whatever attempts was set to and the handler was called with the host_not_found error afterwards.

For the test, the single configured nameserver was black-hole routed.


async_connect()

Calling async_connect() with a black-hole-routed destination resulted in the handler being called with the error boost::asio::error::timed_out after ~189 seconds.

The stack sent the initial SYN and 5 retries. The first retry was sent after 3 seconds, with the retry timeout doubling each time (3+6+12+24+48+96=189). The number of retries can be changed:

% sysctl net.ipv4.tcp_syn_retries
net.ipv4.tcp_syn_retries = 5

The default of 5 is chosen to comply with RFC 1122 (4.2.3.5):

[The retransmission timers] for a SYN segment MUST be set large enough to provide retransmission of the segment for at least 3 minutes. The application can close the connection (i.e., give up on the open attempt) sooner, of course.

3 minutes = 180 seconds, though the RFC doesn't appear to specify an upper bound. There's nothing stopping an implementation from retrying forever.
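
The 189-second figure can be reproduced with a little arithmetic. The sketch below is a simplified model, not kernel code: it assumes a fixed initial timeout that doubles after every unanswered SYN, and the function name syn_give_up_seconds is made up for illustration.

```cpp
#include <cassert>

// Simplified model of SYN retransmission: the initial SYN plus `retries`
// retransmissions, each followed by a wait that doubles every time.
// Returns the total number of seconds before the attempt is abandoned.
long syn_give_up_seconds(long initial_timeout, int retries) {
    assert(initial_timeout > 0 && retries >= 0);
    long total = 0;
    long wait = initial_timeout;
    for (int i = 0; i <= retries; ++i) {  // retries + 1 transmissions in total
        total += wait;
        wait *= 2;
    }
    return total;
}
```

With the values observed in this test, syn_give_up_seconds(3, 5) gives 189, matching the measured delay. Other kernels may start from a different initial timeout, so the absolute number is specific to this setup.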


async_write()

As long as the socket's send buffer wasn't full, this handler was always called right away.

My test established a TCP connection and set a timer to call async_write() a minute later. During the minute where the connection was established but prior to the async_write() call, I tried all sorts of mayhem:

  • Setting a downstream router to black-hole subsequent traffic to the destination.
  • Clearing the session in a downstream firewall so it would reply with spoofed RSTs from the destination.
  • Unplugging my Ethernet cable.
  • Running /etc/init.d/network stop.

No matter what I did, the next async_write() would immediately call its handler to report success.

In the case where the firewall spoofed the RST, the connection was closed immediately, but I had no way of knowing that until I attempted the next operation (which would immediately report boost::asio::error::connection_reset). In the other cases, the connection would remain open and not report errors to me until it eventually timed out 17-18 minutes later.

The worst case for async_write() is if the host is retransmitting and the send buffer is full. If the buffer is full, async_write() won't call its handler until the retransmissions time out. Linux defaults to 15 retransmissions:

% sysctl net.ipv4.tcp_retries2
net.ipv4.tcp_retries2 = 15

The time between the retransmissions increases after each (and is based on many factors such as the estimated round-trip time of the specific connection) but is clamped at 2 minutes. So with the default 15 retransmissions and worst-case 2-minute timeout, the upper bound is 30 minutes for the async_write() handler to be called. When it is called, error is set to boost::asio::error::timed_out.
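
The 30-minute bound is just the retransmission count multiplied by the clamp. Here is a rough sketch of that calculation, assuming the interval doubles after each retransmission until it reaches the clamp. The real kernel derives the interval from the measured round-trip time, so this is only an approximation, and write_give_up_seconds is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>

// Approximate time until a stalled connection is given up on:
// `retransmissions` waits, starting at `initial_interval` seconds,
// doubling each time but never exceeding `max_interval` seconds.
long write_give_up_seconds(int retransmissions, long initial_interval,
                           long max_interval) {
    assert(retransmissions >= 0 && initial_interval > 0 && max_interval > 0);
    long total = 0;
    long interval = std::min(initial_interval, max_interval);
    for (int i = 0; i < retransmissions; ++i) {
        total += interval;
        interval = std::min(interval * 2, max_interval);  // apply the clamp
    }
    return total;
}
```

Calling write_give_up_seconds(15, 120, 120) yields 1800 seconds (30 minutes), the worst case described above. A smaller starting interval gives a lower total.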


async_read()

In principle, this should never call its handler as long as the connection stays established and no data is received. I haven't had time to test it.
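
The expected behavior can at least be illustrated below the asio layer. The following is a minimal sketch using plain POSIX calls rather than boost::asio, and a local socketpair(2) rather than a real TCP connection, so it is an analogy and not the test described above. The helper names are made up. It shows that a connected socket with no incoming data never becomes readable by itself; only the caller's own timeout ends the wait.

```cpp
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cassert>

// Returns true if fd becomes readable within timeout_ms,
// false on timeout or on a poll() error.
bool readable_within(int fd, int timeout_ms) {
    pollfd p{};
    p.fd = fd;
    p.events = POLLIN;
    int rc = poll(&p, 1, timeout_ms);
    return rc > 0 && (p.revents & POLLIN) != 0;
}

// Creates a connected pair of local sockets, sends nothing, and reports
// whether waiting on one end ended without the socket becoming readable.
bool idle_read_times_out(int timeout_ms) {
    int fds[2];
    int rc = socketpair(AF_UNIX, SOCK_STREAM, 0, fds);
    assert(rc == 0);
    (void)rc;
    bool timed_out = !readable_within(fds[0], timeout_ms);
    close(fds[0]);
    close(fds[1]);
    return timed_out;
}
```

In asio terms, the poll() timeout plays the role that a deadline_timer would play next to async_read().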

eater answered Sep 23 '22 12:09


Those two calls (async_resolve and async_connect) MAY have timeouts that get propagated up to your handlers, but you might be surprised at how long it takes before either of them times out. (I have let a connection just sit and try to connect on a single connect call for over 10 minutes with boost::asio before killing the process.) Also, the async_read and async_write calls do not have timeouts associated with them, so if you wish to have timeouts on your reads and writes, you will still need a deadline_timer.

diverscuba23 answered Sep 26 '22 12:09