
How to write a Rust compiler plugin that generates a module?

I am writing a Rust compiler plugin that expands

choose! {
    test_a
    test_b
}

to

#[cfg(feature = "a")]
mod test_a;
#[cfg(feature = "b")]
mod test_b;
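
To make the intended mapping concrete, here is a minimal std-only sketch (not plugin code) that builds the expected expansion as text. The helper names feature_for and expected_expansion are hypothetical and exist only for illustration; trim_start_matches is the current name of the trim_left_matches call used in the plugin, and like it strips every leading repetition of the prefix.

```rust
/// Derive the feature name from a module name by stripping the leading
/// `test_` prefix (hypothetical helper, mirrors the plugin's trimming).
fn feature_for(mod_name: &str) -> &str {
    mod_name.trim_start_matches("test_")
}

/// Render the text that `choose!` is meant to expand to
/// (hypothetical helper, only for illustration).
fn expected_expansion(mod_names: &[&str]) -> String {
    let mut out = String::new();
    for name in mod_names {
        out.push_str(&format!(
            "#[cfg(feature = \"{}\")]\nmod {};\n",
            feature_for(name),
            name
        ));
    }
    out
}
```

Calling expected_expansion(&["test_a", "test_b"]) should give the four lines shown above.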

It is almost done, but the generated modules contain nothing in the final expanded code. I guess the reason is that the span doesn't cover the module file.

use syntax::ast;
use syntax::ptr::P;
use syntax::codemap;
use syntax::parse::token;
use syntax::tokenstream::TokenTree;
use syntax::ext::base::{ExtCtxt, MacResult, DummyResult, MacEager};
use syntax::ext::build::AstBuilder;
use syntax_pos::Span;
use rustc_plugin::Registry;
use syntax::util::small_vector::SmallVector;

// Ideally, it will expand
//
// ```rust
// choose! {
//   test_a
//   test_b
// }
// ```
// to
// ```rust
// #[cfg(feature = "a")]
// mod test_a;
// #[cfg(feature = "b")]
// mod test_b;
// ```
//
// but the modules contain nothing in the expanded code at present

fn choose(cx: &mut ExtCtxt, sp: Span, args: &[TokenTree]) -> Box<MacResult + 'static> {
    let mut test_mods: SmallVector<P<ast::Item>> = SmallVector::many(vec![]);
    for arg in args {
        let mut attrs = vec![];
        let text = match arg {
            &TokenTree::Token(_, token::Ident(s)) => s.to_string(),
            _ => {
                return DummyResult::any(sp);
            }
        };
        let cfg_str = token::InternedString::new("cfg");
        let feat_str = token::InternedString::new("feature");
        attrs.push(cx.attribute(sp,
                                cx.meta_list(sp,
                                             cfg_str,
                                             vec![cx.meta_name_value(sp,
                                                                     feat_str,
                                                                     ast::LitKind::Str(token::intern_and_get_ident(text.trim_left_matches("test_")), ast::StrStyle::Cooked))])));
        test_mods.push(P(ast::Item {
            ident: cx.ident_of(text.as_str()),
            attrs: attrs,
            id: ast::DUMMY_NODE_ID,
            node: ast::ItemKind::Mod(
                // === How to include the specified module file here? ===
                ast::Mod {
                    inner: codemap::DUMMY_SP,
                    items: vec![],
                }
            ),
            vis: ast::Visibility::Inherited,
            span: sp,
        }))
    }

    MacEager::items(test_mods)
}

#[plugin_registrar]
pub fn plugin_registrar(reg: &mut Registry) {
    reg.register_macro("choose", choose);
}


knight42 asked Aug 23 '16 07:08





1 Answer

2016-08-25 update: I first tried libsyntax::parse::new_parser_from_source_str to avoid setting the module path manually, but new_parser_from_source_str only locates modules relative to the CWD, which is unexpected.

As pointed out by @Francis, the real path of a module file may be something like foo/mod.rs. I discovered a function named new_parser_from_source_str in libsyntax::parse, which creates a new parser from a source string, and had hoped to let the compiler handle this case for me. Because of the CWD limitation, I have to handle this case manually. The updated code:

fn choose(cx: &mut ExtCtxt, sp: Span, args: &[TokenTree]) -> Box<MacResult + 'static> {
    let mut test_mods = SmallVector::zero();
    let cfg_str = intern("cfg");
    let ftre_str = intern("feature");
    for arg in args {
        let mut attrs = vec![];
        let mod_name = match arg {
            &TokenTree::Token(_, token::Ident(s)) => s.to_string(),
            _ => {
                return DummyResult::any(sp);
            }
        };
        attrs.push(cx.attribute(sp,
                                cx.meta_list(sp,
                                             // simply increase the reference counter
                                             cfg_str.clone(),
                                             vec![cx.meta_name_value(sp,
                                                                     ftre_str.clone(),
                                                                     ast::LitKind::Str(intern(mod_name.trim_left_matches("test_")), ast::StrStyle::Cooked))])));

        let mut mod_path = PathBuf::from(&cx.codemap().span_to_filename(sp));
        let dir = mod_path.parent().expect("no parent directory").to_owned();
        let default_path = dir.join(format!("{}.rs", mod_name.as_str()));
        let secondary_path = dir.join(format!("{}/mod.rs", mod_name.as_str()));
        match (default_path.exists(), secondary_path.exists()) {
            (false, false) => {
                cx.span_err(sp, &format!("file not found for module `{}`", mod_name.as_str()));
                return DummyResult::any(sp);
            }
            (true, true) => {
                cx.span_err(sp, &format!("file for module `{}` found at both {} and {}", mod_name.as_str(), default_path.display(), secondary_path.display()));
                return DummyResult::any(sp);
            }
            (true, false) => mod_path = default_path,
            (false, true) => mod_path = secondary_path,
        }

        test_mods.push(P(ast::Item {
            ident: cx.ident_of(mod_name.as_str()),
            attrs: attrs,
            id: ast::DUMMY_NODE_ID,
            node: ast::ItemKind::Mod(
                ast::Mod {
                    inner: sp,
                    items: expand_include(cx, sp, &mod_path),
                }
            ),
            vis: ast::Visibility::Inherited,
            span: sp,
        }))
    }

    MacEager::items(test_mods)
}
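
The path lookup in the updated code can be tried outside the compiler. The following is a minimal sketch using only std, under the assumption that just the two candidates checked above matter (it ignores #[path] attributes and any other lookup rules the compiler applies). The function name resolve_mod_file is hypothetical, and errors are returned as strings instead of being reported through cx.span_err.

```rust
use std::path::{Path, PathBuf};

/// Find the file backing `mod <mod_name>;` for a declaring file located in `dir`.
/// Exactly one of `<dir>/<mod_name>.rs` and `<dir>/<mod_name>/mod.rs` must exist.
/// (Hypothetical helper; same decision table as the match in the plugin.)
fn resolve_mod_file(dir: &Path, mod_name: &str) -> Result<PathBuf, String> {
    let default_path = dir.join(format!("{}.rs", mod_name));
    let secondary_path = dir.join(mod_name).join("mod.rs");
    match (default_path.exists(), secondary_path.exists()) {
        (true, false) => Ok(default_path),
        (false, true) => Ok(secondary_path),
        (false, false) => Err(format!("file not found for module `{}`", mod_name)),
        (true, true) => Err(format!(
            "file for module `{}` found at both {} and {}",
            mod_name,
            default_path.display(),
            secondary_path.display()
        )),
    }
}
```

Returning a Result keeps the lookup testable on its own; in the plugin, the Err branch would map to cx.span_err followed by DummyResult::any(sp), as in the code above.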

Finally I found the solution! \o/

Rust processes a module file much like include! does. So I looked at the implementation of the include! macro, which can be found here, and rewrote it to suit my needs:

use ::std::path::Path;
use ::std::path::PathBuf;
use syntax::parse::{self, token};
use syntax::errors::FatalError;
macro_rules! panictry {
    ($e:expr) => ({
        match $e {
            Ok(e) => e,
            Err(mut e) => {
                e.emit();
                panic!(FatalError);
            }
        }
    })
}

pub fn expand_include<'cx>(cx: &'cx mut ExtCtxt, sp: Span, file: &Path) -> Vec<P<ast::Item>> {
    let mut p = parse::new_sub_parser_from_file(cx.parse_sess(), cx.cfg(), file, true, None, sp);
    let mut ret = vec![];
    while p.token != token::Eof {
        match panictry!(p.parse_item()) {
            Some(item) => ret.push(item),
            None => {
                panic!(p.diagnostic().span_fatal(p.span,
                                                 &format!("expected item, found `{}`", p.this_token_to_string())))
            }
        }
    }
    ret
}

To get items from the module file, we must find out the real module path:

let mut mod_path = PathBuf::from(&cx.codemap().span_to_filename(sp));
mod_path.set_file_name(mod_name.as_str());
mod_path.set_extension("rs");
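
As a quick check of what these two PathBuf calls do, here is a small std-only sketch. The function name sibling_module_file is hypothetical; in the plugin the starting path comes from span_to_filename(sp), which is replaced here by a plain path argument.

```rust
use std::path::{Path, PathBuf};

/// Given the file that contains the macro call, build the path of a sibling
/// `<mod_name>.rs` file (hypothetical helper, same steps as the snippet above).
fn sibling_module_file(current_file: &Path, mod_name: &str) -> PathBuf {
    let mut mod_path = current_file.to_path_buf();
    // Replace the last component, e.g. `lib.rs` becomes `test_a`.
    mod_path.set_file_name(mod_name);
    // Append the extension, e.g. `test_a` becomes `test_a.rs`.
    mod_path.set_extension("rs");
    mod_path
}
```

For example, a macro call in src/lib.rs with the module name test_a yields src/test_a.rs. This only covers the <name>.rs layout; a <name>/mod.rs layout needs separate handling.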

Then we can construct our module node like this:

P(ast::Item {
    ident: cx.ident_of(mod_name.as_str()),
    attrs: attrs,
    id: ast::DUMMY_NODE_ID,
    node: ast::ItemKind::Mod(ast::Mod {
        inner: sp,
        items: expand_include(cx, sp, &mod_path),
    }),
    vis: ast::Visibility::Inherited,
    span: sp,
})

In summary, the plugin should be rewritten as follows:

#![feature(plugin_registrar, rustc_private)]

extern crate syntax;
extern crate rustc_plugin;

use syntax::ast;
use syntax::ptr::P;
use syntax::codemap::Span;
use syntax::parse::{self, token};
use syntax::tokenstream::TokenTree;
use syntax::ext::base::{ExtCtxt, MacResult, DummyResult, MacEager};
use syntax::errors::FatalError;
use syntax::ext::build::AstBuilder;
use rustc_plugin::Registry;
use syntax::util::small_vector::SmallVector;

use ::std::path::Path;
use ::std::path::PathBuf;

macro_rules! panictry {
    ($e:expr) => ({
        match $e {
            Ok(e) => e,
            Err(mut e) => {
                e.emit();
                panic!(FatalError);
            }
        }
    })
}

pub fn expand_include<'cx>(cx: &'cx mut ExtCtxt, sp: Span, file: &Path) -> Vec<P<ast::Item>> {
    let mut p = parse::new_sub_parser_from_file(cx.parse_sess(), cx.cfg(), file, true, None, sp);
    let mut ret = vec![];
    while p.token != token::Eof {
        match panictry!(p.parse_item()) {
            Some(item) => ret.push(item),
            None => {
                panic!(p.diagnostic().span_fatal(p.span,
                                                 &format!("expected item, found `{}`", p.this_token_to_string())))
            }
        }
    }
    ret
}

fn intern(s: &str) -> token::InternedString {
    token::intern_and_get_ident(s)
}

fn choose(cx: &mut ExtCtxt, sp: Span, args: &[TokenTree]) -> Box<MacResult + 'static> {
    let mut test_mods = SmallVector::zero();
    let cfg_str = intern("cfg");
    let feat_str = intern("feature");
    for arg in args {
        let mut attrs = vec![];
        let mod_name = match arg {
            &TokenTree::Token(_, token::Ident(s)) => s.to_string(),
            _ => {
                return DummyResult::any(sp);
            }
        };
        attrs.push(cx.attribute(sp,
                                cx.meta_list(sp,
                                             // simply increase the reference counter
                                             cfg_str.clone(),
                                             vec![cx.meta_name_value(sp,
                                                                     feat_str.clone(),
                                                                     ast::LitKind::Str(intern(mod_name.trim_left_matches("test_")), ast::StrStyle::Cooked))])));

        let mut mod_path = PathBuf::from(&cx.codemap().span_to_filename(sp));
        mod_path.set_file_name(mod_name.as_str());
        mod_path.set_extension("rs");

        test_mods.push(P(ast::Item {
            ident: cx.ident_of(mod_name.as_str()),
            attrs: attrs,
            id: ast::DUMMY_NODE_ID,
            node: ast::ItemKind::Mod(
                ast::Mod {
                    inner: sp,
                    items: expand_include(cx, sp, &mod_path),
                }
            ),
            vis: ast::Visibility::Inherited,
            span: sp,
        }))
    }

    MacEager::items(test_mods)
}

#[plugin_registrar]
pub fn plugin_registrar(reg: &mut Registry) {
    reg.register_macro("choose", choose);
}
knight42 answered Oct 29 '22 04:10
