
How can I create hygienic identifiers in code generated by procedural macros?

When writing a declarative (macro_rules!) macro, we automatically get macro hygiene. In this example, I declare a variable named f in the macro and pass in an identifier f which becomes a local variable:

macro_rules! decl_example {
    ($tname:ident, $mname:ident, ($($fstr:tt),*)) => {
        impl std::fmt::Display for $tname {
            fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
                let Self { $mname } = self;
                write!(f, $($fstr),*)
            }
        }
    }
}

struct Foo {
    f: String,
}

decl_example!(Foo, f, ("I am a Foo: {}", f));

fn main() {
    let f = Foo {
        f: "with a member named `f`".into(),
    };
    println!("{}", f);
}

This code compiles, but if you look at the partially-expanded code, you can see that there's an apparent conflict:

impl std::fmt::Display for Foo {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        let Self { f } = self;
        write!(f, "I am a Foo: {}", f)
    }
}
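To see the hygiene in isolation, here is a minimal sketch separate from the example above. The names `double_it` and `demo` are made up for illustration. The macro introduces its own local `x`, yet the caller's `x` inside the argument is left alone:

```rust
// Hypothetical macro for illustration: it introduces its own local `x`.
macro_rules! double_it {
    ($e:expr) => {{
        // This `x` belongs to the macro's syntax context, not the caller's.
        let x = 2;
        $e * x
    }};
}

// Hypothetical caller that also has a local named `x`.
fn demo() -> i32 {
    let x = 10;
    // The `x` in the argument still refers to the caller's `x` (10),
    // even though the expansion contains `let x = 2;` right before it.
    double_it!(x + 1)
}
```

Here `demo()` evaluates to `(10 + 1) * 2 = 22`. A purely textual copy-and-paste expansion would instead compute `(2 + 1) * 2 = 6`.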

I am writing the equivalent of this declarative macro as a procedural macro, but do not know how to avoid potential name conflicts between the user-provided identifiers and identifiers created by my macro. As far as I can see, the generated code has no notion of hygiene and is just a string:

src/main.rs

use my_derive::MyDerive;

#[derive(MyDerive)]
#[my_derive(f)]
struct Foo {
    f: String,
}

fn main() {
    let f = Foo {
        f: "with a member named `f`".into(),
    };
    println!("{}", f);
}

Cargo.toml

[package]
name = "example"
version = "0.1.0"
edition = "2018"

[dependencies]
my_derive = { path = "my_derive" }

my_derive/src/lib.rs

extern crate proc_macro;

use proc_macro::TokenStream;
use quote::quote;
use syn::{parse_macro_input, DeriveInput, Meta, NestedMeta};

#[proc_macro_derive(MyDerive, attributes(my_derive))]
pub fn my_macro(input: TokenStream) -> TokenStream {
    let input = parse_macro_input!(input as DeriveInput);

    let name = input.ident;

    let attr = input.attrs.into_iter().find(|a| a.path.is_ident("my_derive")).expect("No name passed");
    let meta = attr.parse_meta().expect("Unknown attribute format");
    let meta = match meta {
        Meta::List(ml) => ml,
        _ => panic!("Invalid attribute format"),
    };
    let meta = meta.nested.first().expect("Must have one path");
    let meta = match meta {
        NestedMeta::Meta(Meta::Path(p)) => p,
        _ => panic!("Invalid nested attribute format"),
    };
    let field_name = meta.get_ident().expect("Not an ident");

    let expanded = quote! {
        impl std::fmt::Display for #name {
            fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
                let Self { #field_name } = self;
                write!(f, "I am a Foo: {}", #field_name)
            }
        }
    };

    TokenStream::from(expanded)
}

my_derive/Cargo.toml

[package]
name = "my_derive"
version = "0.1.0"
edition = "2018"

[lib]
proc-macro = true

[dependencies]
syn = "1.0.13"
quote = "1.0.2"
proc-macro2 = "1.0.7"

With Rust 1.40, this produces the compiler error:

error[E0599]: no method named `write_fmt` found for type `&std::string::String` in the current scope
 --> src/main.rs:3:10
  |
3 | #[derive(MyDerive)]
  |          ^^^^^^^^ method not found in `&std::string::String`
  |
  = help: items from traits can only be used if the trait is in scope
  = note: this error originates in a macro outside of the current crate (in Nightly builds, run with -Z external-macro-backtrace for more info)
help: the following trait is implemented but not in scope; perhaps add a `use` for it:
  |
1 | use std::fmt::Write;
  |

What techniques exist to namespace my identifiers from identifiers outside of my control?

Shepmaster asked Jan 06 '20 19:01


2 Answers

Summary: as of Rust 1.40, you can't yet use hygienic identifiers with proc macros on stable Rust. Your best bet is to use a particularly ugly name such as __your_crate_your_name.


You are creating identifiers (in particular, f) by using quote!. This is certainly convenient, but it's just a helper around the actual proc macro API the compiler offers. So let's take a look at that API to see how we can create identifiers! In the end we need a TokenStream, as that's what our proc macro returns. How can we construct such a token stream?

We can parse it from a string, e.g. "let f = 3;".parse::<TokenStream>(). But this was basically an early solution and is discouraged now. In any case, all identifiers created this way behave in a non-hygienic manner, so this won't solve your problem.

The second way (which quote! uses under the hood) is to create a TokenStream manually by creating a bunch of TokenTrees. One kind of TokenTree is an Ident (identifier). We can create an Ident via new:

fn new(string: &str, span: Span) -> Ident

The string parameter is self-explanatory, but the span parameter is the interesting part! A Span stores the location of something in the source code and is usually used for error reporting (so that rustc can point to a misspelled variable name, for example). But in the Rust compiler, spans carry more than location information: they also determine the kind of hygiene an identifier gets. We can see two constructor functions for Span:

  • fn call_site() -> Span: creates a span with call-site hygiene. This is what you call "unhygienic" and is equivalent to copying and pasting. If two identifiers have the same string, they will collide or shadow each other.

  • fn def_site() -> Span: this is what you are after. Technically called definition-site hygiene, this is what you call "hygienic". The identifiers you define and those of your user live in different universes and will never collide. As you can see in the docs, this method is still unstable and thus only usable on a nightly compiler. Bummer!
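The difference between the two kinds of hygiene can be pictured with a toy model. This is only a sketch and not the compiler's real data structures: `ToyIdent`, `Ctxt` and `same_binding` are hypothetical names. The idea is that an identifier is its text plus a context, and two identifiers can only name the same binding if both parts match:

```rust
// Toy model only: NOT how rustc actually represents spans or hygiene.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Ctxt {
    // Context of the code the user wrote; `Span::call_site()` reuses it.
    CallSite,
    // A separate context per macro expansion, as with `Span::def_site()`.
    DefSite(u32),
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct ToyIdent {
    name: String,
    ctxt: Ctxt,
}

impl ToyIdent {
    fn new(name: &str, ctxt: Ctxt) -> Self {
        ToyIdent { name: name.to_string(), ctxt }
    }

    // Two identifiers can collide or shadow each other only if
    // both the text and the context are equal.
    fn same_binding(&self, other: &ToyIdent) -> bool {
        self == other
    }
}
```

In this model, a user-written `f` and a macro-created `f` with `Ctxt::CallSite` are the same binding, while a macro-created `f` with `Ctxt::DefSite(0)` is not.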

There are no really great workarounds. The obvious one is to use a really ugly name like __your_crate_some_variable. To make it a bit easier for you, you can create that identifier once and use it within quote!:

let ugly_name = quote! { __your_crate_some_variable };
quote! {
    let #ugly_name = 3;
    println!("{}", #ugly_name);
}
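Applied to the derive from the question, the effect of the ugly name can be sketched by writing out by hand what the macro might expand to. The name `__my_derive_formatter` is an assumption chosen for illustration; the point is only that the formatter parameter no longer shares the field name `f`:

```rust
use std::fmt;

struct Foo {
    f: String,
}

// Hand-written stand-in for the code a derive could generate.
// `__my_derive_formatter` is a hypothetical "ugly" name for the formatter.
impl fmt::Display for Foo {
    fn fmt(&self, __my_derive_formatter: &mut fmt::Formatter<'_>) -> fmt::Result {
        // The user-chosen field name `f` no longer shadows the formatter.
        let Self { f } = self;
        write!(__my_derive_formatter, "I am a Foo: {}", f)
    }
}
```

With this shape, `write!` targets the formatter and the field `f` is only used as the value to print, so the `write_fmt` error from the question should not occur.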

Sometimes you can even search through all identifiers of the user that could collide with yours and then algorithmically choose an identifier that does not collide. This is actually what we did for auto_impl, with a super ugly name as a fallback. This was mainly to keep super ugly names out of the generated documentation.
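A rough sketch of that idea, using plain strings so it can be tried outside a proc macro. This is not auto_impl's actual algorithm, and `pick_fresh_name` is a hypothetical helper; in a real macro the `taken` set would have to be collected from the user's tokens:

```rust
use std::collections::HashSet;

// Hypothetical helper: returns `preferred` if it is free, otherwise the
// first `preferred_N` (N = 0, 1, 2, ...) that is not in `taken`.
fn pick_fresh_name(taken: &HashSet<String>, preferred: &str) -> String {
    if !taken.contains(preferred) {
        return preferred.to_string();
    }
    let mut n = 0u32;
    loop {
        let candidate = format!("{}_{}", preferred, n);
        if !taken.contains(&candidate) {
            return candidate;
        }
        n += 1;
    }
}
```

For example, with `f` and `f_0` already taken, asking for `f` yields `f_1`, while asking for `g` yields `g` unchanged. This only helps if every identifier that could collide is actually visible to the macro.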

Apart from that, I'm afraid you cannot really do anything.

Lukas Kalbertodt answered Oct 23 '22 06:10


You can, by using a UUID:

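// Assumes the `uuid` crate (with its `v4` feature) and `proc-macro2` are dependencies.
use proc_macro2::{Ident, Span};

fn generate_unique_ident(prefix: &str) -> Ident {
    // A fresh random UUID makes a clash with a user identifier very unlikely.
    let uuid = uuid::Uuid::new_v4();
    // Hyphens are not valid in identifiers, so replace them with underscores.
    let ident = format!("{}_{}", prefix, uuid).replace('-', "_");

    Ident::new(&ident, Span::call_site())
}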
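To show what such a name looks like without pulling in the `uuid` crate, here is a sketch of just the string handling. The UUID is passed in as text, the sample value in the usage note is arbitrary, and both helper names are hypothetical. The check is deliberately rough (ASCII only, keywords ignored):

```rust
// Mirrors the string handling of `generate_unique_ident`, but takes the
// UUID as text so the result is reproducible.
fn unique_ident_string(prefix: &str, uuid_text: &str) -> String {
    format!("{}_{}", prefix, uuid_text).replace('-', "_")
}

// Rough check: starts with a letter or underscore, continues with
// letters, digits or underscores, and is not the lone `_` pattern.
fn is_ascii_ident(s: &str) -> bool {
    let mut chars = s.chars();
    match chars.next() {
        Some(c) if c == '_' || c.is_ascii_alphabetic() => {}
        _ => return false,
    }
    s != "_" && chars.all(|c| c == '_' || c.is_ascii_alphanumeric())
}
```

With the prefix `__my_derive` and the sample text `67e55044-10b1-426f-9247-bb680e5fe0c8`, the result is `__my_derive_67e55044_10b1_426f_9247_bb680e5fe0c8`. A prefix that starts with a digit would produce a string that is not a valid identifier, so the prefix needs to start with a letter or underscore (or be empty).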
Boiethios answered Oct 23 '22 07:10