
How can I generate trait bounds in a declarative macro?

I have a trait with a large number of associated types. I want a function that uses those associated types on both sides of a where clause bound:

trait Kind {
    type A;
    type B;
    // 20+ more types
}

trait Bound<T> {}

fn example<K1, K2>()
where
    K1: Kind,
    K2: Kind,
    K1::A: Bound<K2::A>,
    K1::B: Bound<K2::B>,
    // 20+ more bounds
{
}

Typing out all the bounds will be slightly brittle, so I'd like to create a macro to generate this:

fn example<K1, K2>()
where
    K1: Kind,
    K2: Kind,
    how_do_i_write_this!(K1, K2, Bound, [A, B, /* 20+ more types */])
{
}

However, calling a macro on the right-hand side of the where clause bound results in an error:

macro_rules! bound {
    () => { std::fmt::Debug };
}

fn another_example() 
    where
    u8: bound!(),
{}

error: expected one of `(`, `+`, `,`, `::`, `;`, `<`, or `{`, found `!`
 --> src/lib.rs:7:14
  |
7 |     u8: bound!(),
  |              ^ expected one of 7 possible tokens

Is there any clever macro trickery that will allow me to DRY up this code?

I'm OK with the exact placement or arguments of the macro changing. For example, the macro generating the entire fn would be acceptable.

If this is not possible, I can use a build script, but I'd rather keep the code co-located if possible.

Shepmaster asked Aug 10 '21 13:08

2 Answers

Solution (TL;DR)

  • The macro that "emits" the desired bounds

    macro_rules! with_generated_bounds {( $($rules:tt)* ) => (
        macro_rules! __emit__ { $($rules)* }
        __emit__! {
            K1: Kind,
            K2: Kind,
            K1::A: Bound<K2::A>,
            K1::B: Bound<K2::B>,
            // 20+ more bounds
        }
    )}
    
  • The (downstream) user's API

    with_generated_bounds! {( $($bounds:tt)* ) => (
        fn example<K1, K2>()
        where
            K1 : Kind,
            K2 : Kind,
            $($bounds)*
        { … }
    
        trait AnotherExample<K1 : Kind, K2 : Kind>
        where
            $($bounds)*
        { … }    
    )}
    

Explanation

This is an alternative to sk_pleasant's answer, where they rightfully point out that all macros (including procedural ones, for those wondering) have a limited set of allowed call sites.

  • The best-known example of this limitation is the concat_idents! macro (or any easy-to-write procedural macro polyfill of it): while it is possible to have a macro expand to a (concatenated) identifier, you are not allowed to call a macro between the fn keyword and the rest of the function definition, which makes concat_idents! useless for defining new functions (and the same limitation makes such a macro unusable for defining new types, etc.).

    And how do people circumvent the concat_idents! limitation? The most widespread tool / crate out there to tackle this is ::paste, with an eponymous macro.

    The syntax of the macro is special. Rather than writing:

    fn
    some_super_fancy_concat_idents![foo, bar]
    (args…)
    { body… }
    

    since, as I mentioned, this is not possible, ::paste::paste!'s idea is to be called in a place where macro calls are allowed, such as when expanding to a whole item, and thus to require that it wrap the whole function definition:

    outer_macro! {
        fn
        /* some special syntax here to signal to `outer_macro!` the intent
           to concatenate the identifiers `foo` and `bar`. */
        (args…)
        { body… }
    }
    

    e.g.,

    ::paste::paste! {
        fn [< foo bar >] (args…) {
            body…
        }
    }
    

    When we come to think about this, thanks to the outer macro which sees the whole input "code" as arbitrary tokens (not necessarily Rust code!), we get to support imaginary syntaxes such as [< … >], or even a syntax imitating (and faking!) macro calls but which in reality are just a syntactical designator much like [< … >] was. That is, paste!'s API could have been:

    imaginary::paste! { // <- preprocessor
        // not a real macro call,
        // just a syntactical designator
        // vvvvvvvvvvvvvvvvvvvvvvvv
        fn concat_idents!(foo, bar) (args…) { body… }
    }
    

    The two key ideas with this whole thing are:

    • By using an outer call which wraps the whole function definition (an item), we get to avoid worrying about macro call sites 🙂

    • We also get to feature our own arbitrary syntax and rules, such as pseudo macros.

    These are the core ideas of the preprocessor pattern.

At this point, similar to paste!, a proc-macro approach with the following API could be envisioned:

my_own_preprocessor! {
    #![define_pseudo_macro(my_bounds := {
        K1: Kind,
        K2: Kind,
        K1::A: Bound<K2::A>,
        K1::B: Bound<K2::B>,
        // 20+ more bounds
    })]

    fn example<K1, K2>()
    where
        K1: Kind,
        K2: Kind,
        my_bounds!() // <- fake macro / syntactical designator for `…preprocessor!`
    …

    trait AnotherExample<K1 : Kind, K2 : Kind>
    where
        my_bounds!() // <- ditto
    {}
}

This could be done, but implementing the helper proc-macro (my_own_preprocessor!) is non-trivial.


There is another approach, similar to the preprocessor pattern, which in this instance is much easier to implement: the macro-targeted callback, or Continuation-Passing Style (CPS), pattern. It shows up from time to time, but is a tad cumbersome. The idea is that the tokens we wish to "emit" are not emitted directly; instead, they are passed to another macro (one provided by the caller!), which is ultimately responsible for handling these tokens and emitting a valid macro expansion, such as a bunch of items/functions.

For instance, consider doing:

macro_rules! emit_defs {(
    $($bounds:tt)*
) => (
    fn example<K1, K2>()
    where
        K1 : Kind,
        K2 : Kind,
        $($bounds)*
    { … }

    trait AnotherExample<K1 : Kind, K2 : Kind>
    where
        $($bounds)*
    { … }
)}

generate_bounds!(=> emit_defs!);

If this seems like an awkward but acceptable API, then you should know that implementing the body of generate_bounds! is super trivial! Indeed, it's just:

macro_rules! generate_bounds {(
    => $macro_name:ident !
    /* Optionally, we could try to support a fully qualified macro path */
) => (
    $macro_name! {
        K1::A: Bound<K2::A>,
        K1::B: Bound<K2::B>,
        // 20+ more bounds
    }
)}

Compare this to the naïve definition of our macro:

macro_rules! generate_bounds {() => (
    K1::A: Bound<K2::A>,
    K1::B: Bound<K2::B>,
    // 20+ more bounds
)}

The only difference is that we take a macro (which will be fed our returned "value") as input, and that we wrap our "returned" code within an invocation of it.

At this point I suggest pausing to stare at the previous snippets. The conceptual simplicity (even if noisy) and power of callback-based patterns can often be outstanding, and this is no exception!
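To make the named-callback pattern concrete, here is a minimal self-contained sketch. The traits are trimmed to two associated types, the return value of `example` is only there so the call can be observed, and the `Bytes` implementor with its `u8`/`u16` impls is hypothetical (not part of the question), added so that the generated function can actually be called.

```rust
// Traits from the question, trimmed to two associated types.
trait Kind {
    type A;
    type B;
}

trait Bound<T> {}

// Callee: instead of expanding to the bounds directly (not a legal
// macro call site), it hands them to whichever macro the caller names.
macro_rules! generate_bounds {
    (=> $macro_name:ident !) => {
        $macro_name! {
            K1::A: Bound<K2::A>,
            K1::B: Bound<K2::B>,
        }
    };
}

// Caller-provided callback: receives the bounds as raw tokens and
// splices them into the where clause of a complete item.
macro_rules! emit_defs {
    ( $($bounds:tt)* ) => {
        fn example<K1, K2>() -> &'static str
        where
            K1: Kind,
            K2: Kind,
            $($bounds)*
        {
            "all bounds hold"
        }
    };
}

generate_bounds!(=> emit_defs!);

// Hypothetical implementor, only here to exercise `example`.
struct Bytes;

impl Kind for Bytes {
    type A = u8;
    type B = u16;
}

impl Bound<u8> for u8 {}
impl Bound<u16> for u16 {}
```

Calling `example::<Bytes, Bytes>()` compiles only because `u8: Bound<u8>` and `u16: Bound<u16>` are implemented; removing either impl should turn the call into a compile error, which is a quick way to confirm the generated bounds are really in the where clause.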


This is already pretty good, and it is a solution that can sometimes be spotted in the Rust ecosystem.

But, imho, this is not good enough: the ergonomics for the user are pretty terrible. Why should the caller go through all the trouble of defining a helper macro, which interrupts the flow of defining the functions they wanted to define? And how should that macro be named? It doesn't really matter: it's a fire-and-forget "callback" macro!

  • We run into much the same issues as those who had to define callbacks in C (even stateless ones): instead of writing

    with(iterator, |each_element: ElementTy| {
        …
    });
    

    a C programmer had to write something equivalent to Rust's:

    fn handle_element(each_element: ElementTy) {
        …
    }
    
    with(iterator, handle_element);
    

    Compare it to our situation:

    macro_rules! handle_bounds {( $($bounds:tt)* ) => (
        fn example…
        where
            $($bounds)*
        …
    )}
    
    generate_bounds!(=> handle_bounds!);
    

From here, it's pretty easy to come up with the desired API. Something along the lines of:

with_generated_bounds! {( $($bounds:tt)* ) => (
    fn example…
    where
        $($bounds)*
    …
)}

Deriving this API from the "named callback" one (the => macro_name! one) is actually quite straightforward: if we stare at the two previous snippets, we can notice that the "callback" the caller provided is exactly the body of a macro_rules! definition.

We can thus define the "helper" macro ourselves (as the callee), using the caller-provided rule(s), and then call this helper macro on the code we wished to emit.

This leads to the solution featured at the beginning of this post (repeated for convenience 🙃):

  • The macro that "emits" the desired bounds

    macro_rules! with_generated_bounds {( $($rules:tt)* ) => (
        /// The helper "callback" macro
        macro_rules! __emit__ { $($rules)* }
    
        __emit__! {
            K1: Kind,
            K2: Kind,
            K1::A: Bound<K2::A>,
            K1::B: Bound<K2::B>,
            // 20+ more bounds
        }
    )}
    
  • The (downstream) user's API

    with_generated_bounds! {( $($bounds:tt)* ) => (
        fn example<K1, K2>()
        where
            K1 : Kind,
            K2 : Kind,
            $($bounds)*
        { … }
    
        trait AnotherExample<K1 : Kind, K2 : Kind>
        where
            $($bounds)*
        { … }    
    )}
    

And voilà 🙂

What about this pattern when taking actual macro args?

For example, the solution above hard-codes the names K1 and K2. What about taking those as parameters?

  • The user API would be along the lines of:

    with_bounds_for! { K1, K2, ( $($bounds:tt)* ) => (
        fn example<K1, K2>()
        where
            $($bounds)*
        …
    )}
    
  • The inlined-callback-pattern macro would then be:

    macro_rules! with_bounds_for {(
        $K1:ident, $K2:ident, $($rules:tt)*
    ) => (
        macro_rules! __emit__ { $($rules)* }
        __emit__! {
            $K1 : Kind,
            $K2 : Kind,
            …
        }
    )}
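As a hedged illustration of the parameterised variant, the sketch below fills the elided part with bounds for just two associated types, `A` and `B`; that choice, the `size_of`-based function body, and the `Small` implementor are assumptions made for the example, not part of the original answer.

```rust
trait Kind {
    type A;
    type B;
}

trait Bound<T> {}

// Inlined-callback macro: the caller passes the two type-parameter names,
// followed by the rule(s) of the anonymous "callback" macro.
macro_rules! with_bounds_for {(
    $K1:ident, $K2:ident, $($rules:tt)*
) => (
    macro_rules! __emit__ { $($rules)* }
    __emit__! {
        $K1: Kind,
        $K2: Kind,
        $K1::A: Bound<$K2::A>,
        $K1::B: Bound<$K2::B>,
    }
)}

// The names are now chosen at the call site (`Left`, `Right`).
with_bounds_for! { Left, Right, ( $($bounds:tt)* ) => (
    fn example<Left, Right>() -> (usize, usize)
    where
        $($bounds)*
    {
        // The associated types are usable in the body because the
        // generated bounds include `Left: Kind` and `Right: Kind`.
        (
            std::mem::size_of::<Left::A>(),
            std::mem::size_of::<Right::B>(),
        )
    }
)}

// Hypothetical implementor, only here to exercise `example`.
struct Small;

impl Kind for Small {
    type A = u8;
    type B = u32;
}

impl Bound<u8> for u8 {}
impl Bound<u32> for u32 {}
```

Here `example::<Small, Small>()` evaluates to the sizes of `u8` and `u32`. Because generic parameter names are not subject to `macro_rules!` hygiene, the `Left` and `Right` written at the call site and the ones substituted into the bounds refer to the same parameters.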
    

Some remark(s)

Note that the expansion of with_generated_bounds! is that of:

  • a macro definition;

  • a macro invocation.

These are two "statements", which thus means the whole expansion of the macro is a "statement" itself, which means the following won't work:

macro_rules! with_42 {( $($rules:tt)* ) => (
    macro_rules! __emit__ { $($rules)* }
    __emit__! { 42 }
)}

//      this macro invocation expands to two "statements";
//      it is thus a statement / `()`-evaluating expression itself
//      vvvvvvvvvv
let x = with_42! {( $ft:expr ) => (
    $ft + 27
)};

This is nihil novi sub sole / nothing new under the sun; it's the same issue as with:

macro_rules! example {() => (
    let ft = 42; // <- one "statement"
    ft + 27      // <- an expression
)}

let x = example!(); // Error

And in that case the solution is easy: wrap the statements within braces, so as to emit a block, which can thus evaluate to its last expression:

macro_rules! example {() => ({
    let ft = 42;
    ft + 27
})}

  • (Incidentally, this is the reason I prefer to use => ( … ) as the right-hand side of macro rules; it's way less error-prone / footgunny than => { … }.)

In that case, the same solution applies to the callback pattern:

macro_rules! with_ft {( $($rules:tt)* ) => ({
    macro_rules! __emit__ { $($rules)* }
    __emit__! { 42 }
})}
// OK
let x = with_ft! {( $ft:expr ) => (
    $ft + 27
)};

This makes the macro expr-friendly, but at the cost of wrapping any item definitions in a scoped block:

// Now the following fails!
with_ft! {( $ft:expr ) => (
    fn get_ft() -> i32 {
        $ft
    }
)}
get_ft(); // Error, no `get_ft` in this scope

Indeed, the definition of get_ft was now scoped within braces 😕
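For reference, the expression-friendly (braced) variant can be checked in isolation with a small sketch like the following; the wrapper function `compute` is a hypothetical name introduced only so that the result can be observed.

```rust
// Braced variant: the whole expansion is a single block expression.
macro_rules! with_ft {( $($rules:tt)* ) => ({
    // The helper "callback" macro, scoped to this block.
    macro_rules! __emit__ { $($rules)* }
    __emit__! { 42 }
})}

// Hypothetical wrapper so the value can be inspected.
fn compute() -> i32 {
    // `with_ft!` is used in expression position here, which only works
    // because its expansion is wrapped in braces.
    let x = with_ft! {( $ft:expr ) => (
        $ft + 27
    )};
    x
}
```

The same braces that make `let x = with_ft! { … };` legal are what confine any item emitted by the callback to the block, as described above.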

This is thus the main limitation of the inlined/anonymous callback pattern: while it is powerful enough to emulate "arbitrary expansions" and "arbitrary call sites", its author must choose beforehand whether or not to wrap the macro definition within a braced block, which makes it compatible with either expression-expanding macros or public-item-expanding macros, but not both. In that regard, the slightly more cumbersome named callback pattern featured in the middle of this post (the => macro_name! syntax) doesn't have this problem.
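A minimal sketch of why the named-callback form avoids this trade-off: since it expands to nothing but a single invocation of the caller's macro, the same definition can be used in item position and in expression position. All names here (`with_ft_named`, `define_get_ft`, `add_27`, `compute`) are hypothetical and chosen only for the illustration.

```rust
// Named-callback variant: expands to exactly one macro invocation,
// so no braces (and no scoping decision) are needed.
macro_rules! with_ft_named {
    (=> $callback:ident !) => {
        $callback! { 42 }
    };
}

// Callback that expands to an item.
macro_rules! define_get_ft {
    ( $ft:expr ) => {
        fn get_ft() -> i32 {
            $ft
        }
    };
}

// Item position: `get_ft` lands in the enclosing scope.
with_ft_named!(=> define_get_ft!);

// Callback that expands to an expression.
macro_rules! add_27 {
    ( $ft:expr ) => {
        $ft + 27
    };
}

fn compute() -> i32 {
    // Expression position: the very same `with_ft_named!` works here too.
    with_ft_named!(=> add_27!)
}
```

The cost is the one already mentioned: each use needs a separately defined and named helper macro.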

Daniel H-M answered Oct 27 '22 08:10


Quote from the Rust Reference on macros:

Macros may be invoked in the following situations:

  • Expressions and statements
  • Patterns
  • Types
  • Items including associated items
  • macro_rules transcribers
  • External blocks

According to this, it is not possible to invoke a macro in the context of a trait bound, so you can't have the exact syntax you used. However, you can invoke a macro in the context of an item and have the macro generate the function including the trait bounds:

trait Kind {
    type A;
    type B;
    // 20+ more types
}

trait Bound<T> {}

macro_rules! generate_func_with_bounds {
    (
        fn $name:ident <$($gens:ident),*> ()
        where
            $($bound_type:ident: $bound_to:ident),*,
            @generate_bounds($first_type:ident, $second_type:ident, $trait:ident, [$($assoc:ident),*])
        {
            $($body:tt)*
        }
    ) => {
        fn $name <$($gens),*> ()
        where
            $($bound_type: $bound_to),*,
            $($first_type::$assoc: $trait<$second_type::$assoc>),*
        {
            $($body)*
        }
    };
}

generate_func_with_bounds!{
    fn example<K1, K2>()
    where
        K1: Kind,
        K2: Kind,
        @generate_bounds(K1, K2, Bound, [A, B])
    {
    }
}

This has the signature you want. Note that you may need to slightly modify the matcher if you want this to work with other functions (for example, functions with parameters, functions that use generic lifetimes and the like – anything that isn't more or less syntactically equivalent to the example() declaration).
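As one possible direction for such a modification, here is a sketch of a matcher that additionally accepts simple `name: Type` parameters and an optional return type. This is an assumption about how the macro could be extended, not something the original macro supports; it still does not handle lifetimes, patterns in parameters, `self`, or bounds on the generics themselves. The `Pair` implementor and the `describe` function are hypothetical.

```rust
trait Kind {
    type A;
    type B;
}

trait Bound<T> {}

macro_rules! generate_func_with_bounds {
    (
        // Extension: `ident: type` parameters and an optional return type.
        fn $name:ident <$($gens:ident),*> ( $($arg:ident : $arg_ty:ty),* ) $(-> $ret:ty)?
        where
            $($bound_type:ident: $bound_to:ident),*,
            @generate_bounds($first_type:ident, $second_type:ident, $trait:ident, [$($assoc:ident),*])
        {
            $($body:tt)*
        }
    ) => {
        fn $name <$($gens),*> ( $($arg : $arg_ty),* ) $(-> $ret)?
        where
            $($bound_type: $bound_to),*,
            // One bound per listed associated type.
            $($first_type::$assoc: $trait<$second_type::$assoc>),*
        {
            $($body)*
        }
    };
}

generate_func_with_bounds! {
    fn describe<K1, K2>(label: &str) -> String
    where
        K1: Kind,
        K2: Kind,
        @generate_bounds(K1, K2, Bound, [A, B])
    {
        format!("{}: generated bounds hold", label)
    }
}

// Hypothetical implementor, only here to exercise `describe`.
struct Pair;

impl Kind for Pair {
    type A = u8;
    type B = u16;
}

impl Bound<u8> for u8 {}
impl Bound<u16> for u16 {}
```

The optional return type relies on `where` being an allowed token after a `ty` fragment in a `macro_rules!` matcher; anything much richer than this is usually easier to handle with a `tt`-muncher or a procedural macro.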

Elias Holzmann answered Oct 27 '22 09:10