 

Scala macros and the JVM's method size limit

I'm replacing some code generation components in a Java program with Scala macros, and am running into the Java Virtual Machine's limit on the size of the generated byte code for individual methods (64 kilobytes).

For example, suppose we have a large-ish XML file that represents a mapping from integers to integers that we want to use in our program. We want to avoid parsing this file at run time, so we'll write a macro that will do the parsing at compile time and use the contents of the file to create the body of our method:

import scala.language.experimental.macros
import scala.reflect.macros.Context

object BigMethod {
  // For this simplified example we'll just make some data up.
  val mapping = List.tabulate(7000)(i => (i, i + 1))

  def lookup(i: Int): Int = macro lookup_impl
  def lookup_impl(c: Context)(i: c.Expr[Int]): c.Expr[Int] = {
    import c.universe._

    val switch = reify(new scala.annotation.switch).tree
    val cases = mapping map {
      case (k, v) => CaseDef(c.literal(k).tree, EmptyTree, c.literal(v).tree)
    }

    c.Expr(Match(Annotated(switch, i.tree), cases))
  }
}
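
(For the real use case, the made-up mapping above would instead be built from the XML file. A sketch of what that could look like inside BigMethod is below; the file name and element layout are only assumptions for illustration, and since the value is computed in the macro's enclosing object, the parsing happens when the compiler expands the macro, not when the generated program runs.)

// Sketch only: load the mapping from an XML file shaped like
// <mapping><entry key="1" value="2"/>...</mapping> (scala.xml was part of
// the standard library in the Scala 2.10 era this question targets).
val mappingFromXml: List[(Int, Int)] = {
  val doc = scala.xml.XML.loadFile("mapping.xml")
  (doc \ "entry").toList map { node =>
    ((node \ "@key").text.toInt, (node \ "@value").text.toInt)
  }
}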

In this case the compiled method would be just over the size limit, but instead of a nice error saying that, we're given a giant stack trace with a lot of calls to TreePrinter.printSeq and are told that we've slain the compiler.

I have a solution that involves splitting the cases into fixed-sized groups, creating a separate method for each group, and adding a top-level match that dispatches the input value to the appropriate group's method. It works, but it's unpleasant, and I'd prefer not to have to use this approach every time I write a macro where the size of the generated code depends on some external resource.
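
To make that workaround concrete, the code the macro ends up generating has roughly the following shape, written out by hand here with only a couple of cases per group; the group size, method names, and error handling are illustrative only (the real generated match also carries the @switch annotation):

// Each fixed-size group of cases lives in its own method, and a top-level
// dispatcher routes the key to the right group by its range, so every
// individual method stays under the 64 KB bytecode limit.
object SplitLookup {
  private def lookup0(i: Int): Int = i match {
    case 0 => 1
    case 1 => 2
    // ... up to case 999 => 1000
    case _ => throw new MatchError(i)
  }

  private def lookup1(i: Int): Int = i match {
    case 1000 => 1001
    case 1001 => 1002
    // ... up to case 1999 => 2000
    case _ => throw new MatchError(i)
  }

  // Top-level dispatch to the appropriate group's method.
  def lookup(i: Int): Int =
    if (i < 1000) lookup0(i) else lookup1(i)
}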

Is there a cleaner way to tackle this problem? More importantly, is there a way to deal with this kind of compiler error more gracefully? I don't like the idea of a library user getting an unintelligible "That entry seems to have slain the compiler" error message just because some XML file that's being processed by a macro has crossed some (fairly low) size threshold.

asked Jun 08 '13 by Travis Brown


2 Answers

In my opinion, putting data into .class files isn't really a good idea. They get parsed as well, just in a binary format, and keeping that much data inside the JVM can have a negative impact on the performance of the garbage collector and the JIT compiler.

In your situation, I would pre-compile the XML into a binary file in a suitable format and parse that at run time. Eligible formats with existing tooling include, e.g., FastRPC or good old DBF. Or maybe pre-fill an ElasticSearch repository if you need quick, advanced lookups and searches. Some implementations of the latter may also provide basic indexing, which could let you skip the parsing entirely: the app would just read from the respective offset.
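
A minimal sketch of that pre-compilation idea, using a plain stream of ints rather than FastRPC or DBF (the format, file name, and object name here are made up for illustration):

import java.io.{ DataInputStream, DataOutputStream, FileInputStream, FileOutputStream }

object BinaryMapping {
  // Write the (key, value) pairs as raw ints: a count followed by the pairs.
  // This would run once, at build time, after parsing the XML.
  def write(path: String, mapping: Seq[(Int, Int)]): Unit = {
    val out = new DataOutputStream(new FileOutputStream(path))
    try {
      out.writeInt(mapping.size)
      mapping foreach { case (k, v) => out.writeInt(k); out.writeInt(v) }
    } finally out.close()
  }

  // Read the pairs back into a Map at startup; no XML parsing at run time.
  def read(path: String): Map[Int, Int] = {
    val in = new DataInputStream(new FileInputStream(path))
    try {
      val size = in.readInt()
      (1 to size).map(_ => (in.readInt(), in.readInt())).toMap
    } finally in.close()
  }
}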

answered Sep 30 '22 by Ondra Žižka


Since somebody has to say something, I followed the instructions at Importers to try to compile the tree before returning it.

If you give the compiler plenty of stack, it will correctly report the error.

(It didn't seem to know what to do with the switch annotation, left as a future exercise.)

apm@mara:~/tmp/bigmethod$ skalac bigmethod.scala ; skalac -J-Xss2m biguser.scala ; skala bigmethod.Test
Error is java.lang.RuntimeException: Method code too large!
Error is java.lang.RuntimeException: Method code too large!
biguser.scala:5: error: You ask too much of me.
  Console println s"5 => ${BigMethod.lookup(5)}"
                                           ^
one error found

as opposed to

apm@mara:~/tmp/bigmethod$ skalac -J-Xss1m biguser.scala
Error is java.lang.StackOverflowError
Error is java.lang.StackOverflowError
biguser.scala:5: error: You ask too much of me.
  Console println s"5 => ${BigMethod.lookup(5)}"
                                           ^

where the client code is just that:

package bigmethod

object Test extends App {
  Console println s"5 => ${BigMethod.lookup(5)}"
}

My first time using this API, but not my last. Thanks for getting me kickstarted.

package bigmethod

import scala.language.experimental.macros
import scala.reflect.macros.Context

object BigMethod {
  // For this simplified example we'll just make some data up.
  //final val size = 700
  final val size = 7000
  val mapping = List.tabulate(size)(i => (i, i + 1))

  def lookup(i: Int): Int = macro lookup_impl
  def lookup_impl(c: Context)(i: c.Expr[Int]): c.Expr[Int] = {

    def compilable[T](x: c.Expr[T]): Boolean = {
      import scala.reflect.runtime.{ universe => ru }
      import scala.tools.reflect._
      //val mirror = ru.runtimeMirror(c.libraryClassLoader)
      val mirror = ru.runtimeMirror(getClass.getClassLoader)
      val toolbox = mirror.mkToolBox()
      val importer0 = ru.mkImporter(c.universe)
      type ruImporter = ru.Importer { val from: c.universe.type }
      val importer = importer0.asInstanceOf[ruImporter]
      val imported = importer.importTree(x.tree)
      val tree = toolbox.resetAllAttrs(imported.duplicate)
      try {
        toolbox.compile(tree)
        true
      } catch {
        case t: Throwable =>
          Console println s"Error is $t"
          false
      }
    }

    import c.universe._

    val switch = reify(new scala.annotation.switch).tree
    val cases = mapping map {
      case (k, v) => CaseDef(c.literal(k).tree, EmptyTree, c.literal(v).tree)
    }

    //val res = c.Expr(Match(Annotated(switch, i.tree), cases))
    val res = c.Expr(Match(i.tree, cases))

    // before returning a potentially huge tree, try compiling it
    //import scala.tools.reflect._
    //val x = c.Expr[Int](c.resetAllAttrs(res.tree.duplicate))
    //val y = c.eval(x)
    if (!compilable(res)) c.abort(c.enclosingPosition, "You ask too much of me.")

    res
  }
}

answered Sep 30 '22 by som-snytt