Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can I reuse definition (AST) subtrees in a macro?

I am working in a Scala embedded DSL and macros are becoming a main tool for achieving my purposes. I am getting an error while trying to reuse a subtree from the incoming macro expression into the resulting one. The situation is quite complex, but (I hope) I have simplified it for its understanding.

Suppose we have this code:

val y = transform {
  val x = 3
  x
}
println(y) // prints 3

where 'transform' is the involved macro. Although it could seem it does absolutely nothing, it is really transforming the shown block into this expression:

3 match { case x => x }

It is done with this macro implementation:

def transform(c: Context)(block: c.Expr[Int]): c.Expr[Int] = {
  import c.universe._
  import definitions._

  block.tree match {
    /* {
     *   val xNam = xVal
     *   xExp
     * }
     */
    case Block(List(ValDef(_, xNam, _, xVal)), xExp) =>
      println("# " + showRaw(xExp)) // prints Ident(newTermName("x"))
      c.Expr(
        Match(
          xVal, 
          List(CaseDef(
            Bind(xNam, Ident(newTermName("_"))),
            EmptyTree,
            /* xExp */ Ident(newTermName("x")) ))))
    case _ => 
      c.error(c.enclosingPosition, "Can't transform block to function")
      block  // keep original expression
  }
}

Notice that xNam corresponds with the variable name, xVal corresponds with its associated value and finally xExp corresponds with the expression containing the variable. Well, if I print the xExp raw tree I get Ident(newTermName("x")), and that is exactly what is set in the case RHS. Since the expression could be modified (for instance x+2 instead of x), this is not a valid solution for me. What I want to do is to reuse the xExp tree (see the xExp comment) while altering the 'x' meaning (it is a definition in the input expression but will be a case LHS variable in the output one), but it launches a long error summarized in:

symbol value x does not exist in org.habla.main.Main$delayedInit$body.apply); see the error output for details.

My current solution consists on the parsing of the xExp to sustitute all the Idents with new ones, but it is totally dependent on the compiler internals, and so, a temporal workaround. It is obvious that the xExp comes along with more information that the offered by showRaw. How can I clean that xExp for allowing 'x' to role the case variable? Can anyone explain the whole picture of this error?

PS: I have been trying unsuccessfully to use the substitute* method family from the TreeApi but I am missing the basics to understand its implications.

like image 857
jeslg Avatar asked Jun 26 '12 13:06

jeslg


1 Answers

Disassembling input expressions and reassembling them in a different fashion is an important scenario in macrology (this is what we do internally in the reify macro). But unfortunately, it's not particularly easy at the moment.

The problem is that input arguments of the macro reach macro implementation being already typechecked. This is both a blessing and a curse.

Of particular interest for us is the fact that variable bindings in the trees corresponding to the arguments are already established. This means that all Ident and Select nodes have their sym fields filled in, pointing to the definitions these nodes refer to.

Here is an example of how symbols work. I'll copy/paste a printout from one of my talks (I don't give a link here, because most of the info in my talks is deprecated by now, but this particular printout has everlasting usefulness):

>cat Foo.scala
def foo[T: TypeTag](x: Any) = x.asInstanceOf[T]
foo[Long](42)

>scalac -Xprint:typer -uniqid Foo.scala
[[syntax trees at end of typer]]// Scala source: Foo.scala
def foo#8339
  [T#8340 >: Nothing#4658 <: Any#4657]
  (x#9529: Any#4657)
  (implicit evidence$1#9530: TypeTag#7861[T#8341])
  : T#8340 =
x#9529.asInstanceOf#6023[T#8341];
Test#14.this.foo#8339[Long#1641](42)(scala#29.reflect#2514.`package`#3414.mirror#3463.TypeTag#10351.Long#10361)

To recap, we write a small snippet and then compile it with scalac, asking the compiler to dump the trees after the typer phase, printing unique ids of the symbols assigned to trees (if any).

In the resulting printout we can see that identifiers have been linked to corresponding definitions. For example, on the one hand, the ValDef("x", ...), which represents the parameter of the method foo, defines a method symbol with id=9529. On the other hand, the Ident("x") in the body of the method got its sym field set to the same symbol, which establishes the binding.

Okay, we've seen how bindings work in scalac, and now is the perfect time to introduce a fundamental fact.

If a symbol has been assigned to an AST node, 
then subsequent typechecks will never reassign it. 

This is why reify is hygienic. You can take a result of reify and insert it into an arbitrary tree (that possibly defines variables with conflicting names) - the original bindings will remain intact. This works because reify preserves the original symbols, so subsequent typechecks won't rebind reified AST nodes.

Now we're all set to explain the error you're facing:

symbol value x does not exist in org.habla.main.Main$delayedInit$body.apply); see the error output for details.

The argument of the transform macro contains both a definition and a reference to a variable x. As we've just learned, this means that the corresponding ValDef and Ident will have their sym fields synchronized. So far, so good.

However unfortunately the macro corrupts the established binding. It recreates the ValDef, but doesn't clean up the sym field of the corresponding Ident. Subsequent typecheck assigns a fresh symbol to the newly created ValDef, but doesn't touch the original Ident that is copied to the result verbatim.

After the typecheck, the original Ident points to a symbol that no longer exists (this is exactly what the error message was saying :)), which leads to a crash during bytecode generation.

So how do we fix the error? Unfortunately there is no easy answer.

One option would be to utilize c.resetLocalAttrs, which recursively erases all symbols in a given AST node. Subsequent typecheck will then reestablish the bindings granted that the code you generated doesn't mess with them (if, for example, you wrap xExp in a block that itself defines a value named x, then you're in trouble).

Another option is to fiddle with symbols. For example, you could write your own resetLocalAttrs that only erases corrupted bindings and doesn't touch the valid ones. You could also try and assign symbols by yourself, but that's a short road to madness, though sometimes one is forced to walk it.

Not cool at all, I agree. We're aware of that and intend to try and fix this fundamental issue sometimes. However right now our hands are full with bugfixing before the final 2.10.0 release, so we won't be able to address the problem in the nearest future. upd. See https://groups.google.com/forum/#!topic/scala-internals/rIyJ4yHdPDU for some additional information.


Bottom line. Bad things happen, because bindings get messed up. Try resetLocalAttrs first, and if it doesn't work, prepare yourself for a chore.

like image 186
Eugene Burmako Avatar answered Sep 30 '22 06:09

Eugene Burmako