Avoid duplicating sync and async code in Rust library

Question

I recently came across a library that provides both sync and async interfaces. Async can be enabled with the async feature flag, and the async/sync functions are distinguished with conditional compilation attributes.

E.g. here's what a sync function looks like:

#[cfg(not(feature = "async"))]
fn perform_query<A: ToSocketAddrs>(&self, payload: &[u8], addr: A) -> Result<Vec<u8>>
{
    // More than 100 lines of code with occasional calls to sync UdpSocket::send_to and recv.
}

And this is what an async function looks like:

#[cfg(feature = "async")]
async fn perform_query<A: ToSocketAddrs>(&self, payload: &[u8], addr: A) -> Result<Vec<u8>>
{
    // More than 100 lines of code with occasional calls to async UdpSocket::send_to and recv.
    // Apart from 3-4 await lines, it does mostly the same thing as its sync counterpart.
}

I found and fixed some bugs in the sync code, and I was about to apply the same fixes to the async code. But since this large function is entirely duplicated, I would have to patch my fixes into the async function by hand, and I started to wonder why most of this function is duplicated in the first place. It seems like hell to maintain in the long run, so I thought I'd do the project a favor by deduplicating it.

Then I ran into issues that made me aware it's not as trivial as I thought. I can certainly differentiate those few lines with conditional compilation, and I could even write a macro that inserts the sync or async version of the UdpSocket calls depending on whether the async feature is enabled. But I can't select the function header that way, because a #[cfg(...)] attribute applies to the entire function item, so if I do something like this, I get massive syntax errors:

#[cfg(not(feature = "async"))]
fn perform_query<A: ToSocketAddrs>(&self, payload: &[u8], addr: A) -> Result<Vec<u8>>
#[cfg(feature = "async")]
async fn perform_query<A: ToSocketAddrs>(&self, payload: &[u8], addr: A) -> Result<Vec<u8>>
{
    // Deduplicated code with occasional differentiation of sync / async UdpSocket calls.
}

I also thought of having only the async function as the core, with async and sync wrapper functions calling it depending on whether the library is compiled as sync or async. But I can't call an async function from a sync function; at the very least I'd need some ugly magic using an async runtime to await / poll the function and then pass the result back synchronously. The sync build of the library would then have to import an async runtime anyway, which I'd rather avoid.

My current idea is to move the processing of the packets into separate sync functions that would be called from sync and async wrappers, which would only deal with the actual UdpSocket calls, but I'm not sure that's the right way to do it. Isn't there a smoother, more elegant way? What is the general approach for this? Or is it normal to duplicate whopping functions for sync and async builds? As you may guess, I have no experience with async programming.

Kevin Reid · Accepted Answer

I also thought of having only the async function as the core, with async and sync wrapper functions calling it … then the sync build of the library would also have to import an async runtime anyway …

This is how reqwest offers its blocking interface. I think it's a perfectly good way to do things, if the library is big enough that the async runtime is not a large additional compilation cost. It has the advantage that all of your IO works exactly the same way in all cases, reducing the chances of subtle bugs.
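To make the shape of that approach concrete, here is a minimal sketch of a single async core behind a blocking facade. It is not how reqwest is implemented (its blocking module drives a real tokio runtime internally); the `block_on` below is a tiny std-only stand-in for a runtime, and `perform_query_async` / `perform_query_blocking` are hypothetical names with a placeholder body instead of real socket calls. A hand-rolled executor like this cannot drive a runtime's sockets, which need that runtime's reactor, so treat it purely as an illustration of the layering.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Unparks the blocked thread when the future signals it can make progress.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Stand-in for a runtime: polls one future to completion on this thread.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(output) => return output,
            Poll::Pending => thread::park(),
        }
    }
}

/// Hypothetical async core: the only copy of the query logic.
/// Placeholder body; real code would await async socket calls here.
async fn perform_query_async(payload: &[u8]) -> Vec<u8> {
    payload.iter().rev().copied().collect()
}

/// Blocking facade: no logic of its own, it only drives the async core.
pub fn perform_query_blocking(payload: &[u8]) -> Vec<u8> {
    block_on(perform_query_async(payload))
}
```

The design point is that a bug fix lands once, in the async core, and the blocking API picks it up automatically. The cost is that the sync build still carries whatever executor the async core needs.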

My current idea is to move the processing of the packets into separate sync functions that would be called from sync and async wrappers, which would only deal with the actual UdpSocket calls

I recommend that you take this option — separating the algorithms from the IO. It has advantages beyond the code de-duplication you are currently aiming for:

- It is likely easier to write unit tests for the packet algorithms, especially tests that cover edge cases in IO, when you can express them as simple function calls than when you also have to set up a peer UDP socket to test anything.
- If you make the algorithms public, they can be used in unusual situations, such as ones that do not involve the operating system's networking stack:
  - no_std environments where the networking is custom and not known to Rust std or your async IO library
  - analysis of captures of traffic (non-real-time)
  - reimplementing the IO side for special requirements (e.g. passing specific flags to the OS) while still being able to use your library's algorithms
- Error handling, a necessary part of IO, may be clearer if it is less intertwined with the algorithms, as such a split would require.

This style of library design is sometimes called “sans I/O” (at least by Python programmers). You can see it in Rust with, for example, the http library which provides HTTP parsing algorithms but no IO whatsoever.
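As a rough sketch of what such a split could look like for the function in the question: the wire format below (a 2-byte big-endian transaction ID followed by the payload) is invented for illustration, as are the names `encode_query`, `decode_response`, and `ProtocolError`, and `perform_query` takes the socket as a parameter instead of `&self`. Only the std `UdpSocket` calls are real API.

```rust
use std::io;
use std::net::{ToSocketAddrs, UdpSocket};

/// Errors from the pure protocol layer (hypothetical).
#[derive(Debug, PartialEq)]
pub enum ProtocolError {
    TooShort,
    IdMismatch { expected: u16, got: u16 },
}

/// Pure: build the datagram to send. No IO, so it is shared by every build.
pub fn encode_query(id: u16, payload: &[u8]) -> Vec<u8> {
    let mut packet = Vec::with_capacity(2 + payload.len());
    packet.extend_from_slice(&id.to_be_bytes());
    packet.extend_from_slice(payload);
    packet
}

/// Pure: check a received datagram against the expected ID and return its body.
pub fn decode_response(id: u16, datagram: &[u8]) -> Result<Vec<u8>, ProtocolError> {
    if datagram.len() < 2 {
        return Err(ProtocolError::TooShort);
    }
    let got = u16::from_be_bytes([datagram[0], datagram[1]]);
    if got != id {
        return Err(ProtocolError::IdMismatch { expected: id, got });
    }
    Ok(datagram[2..].to_vec())
}

/// Thin sync shell: the only place that touches a socket.
/// An async shell would have the same shape, with an async socket type
/// and `.await` on the send and receive calls.
pub fn perform_query<A: ToSocketAddrs>(
    socket: &UdpSocket,
    id: u16,
    payload: &[u8],
    addr: A,
) -> io::Result<Vec<u8>> {
    socket.send_to(&encode_query(id, payload), addr)?;
    let mut buf = [0u8; 1500];
    let (len, _peer) = socket.recv_from(&mut buf)?;
    decode_response(id, &buf[..len])
        .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, format!("{e:?}")))
}
```

With this layout the bulk of the logic lives in `encode_query` and `decode_response`, which can be unit tested or fed captured traffic without opening a socket, and the duplicated sync/async code shrinks to a few lines of send and receive.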


Tags:

asynchronous

async-await

rust

MegaBrutal
