
ANTLR V4 + Java8 Grammar -> OutOfMemoryError

I'm trying to use ANTLR V4 with the publicly available Java 8 grammar: https://github.com/antlr/grammars-v4/blob/master/java8/Java8.g4

I generated the parser classes and tried to parse the Java 8 JRE sources, but at java.text.SimpleDateFormat.java it crashes with:

java.lang.OutOfMemoryError: GC overhead limit exceeded

It also crashes when I try to parse that single file on its own.

Can this be solved somehow? Is it correct to assume that ANTLR V4 can't handle files with more than 2000 LOC?

What I've done so far:

  • Increased the memory assigned to the JVM in several steps from 256MB up to 4GB. The error then changes to

    java.lang.OutOfMemoryError: Java heap space

  • Checked that there is no syntactical problem with the input file.
    First I removed the first half of the file -> parsing seems okay,
    then I undid that and removed the second half of the file -> parsing seems okay
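
When raising the heap limit in steps like this, it can help to confirm that the JVM actually picked up the new setting. The following is a minimal sketch using only the standard library; the class name HeapCheck and its method names are made up for illustration and are not part of ANTLR.

```java
public class HeapCheck {
    /** Maximum heap the JVM will attempt to use, in MiB (reflects -Xmx). */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    /** Heap currently in use (allocated minus free), in MiB. */
    static long usedHeapMiB() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("max heap:  " + maxHeapMiB() + " MiB");
        System.out.println("used heap: " + usedHeapMiB() + " MiB");
    }
}
```

Running it with, for example, `java -Xmx4g HeapCheck` should report a maximum close to the requested size. Calling the same two methods right before the parse gives a rough idea of how much headroom the parser starts with.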

Ronald Duck asked Oct 02 '15 14:10


1 Answer

It looks like the grammar in that repository is based on one I wrote. To perform well, the grammar relies on certain functionality which is only available in my "optimized" fork of ANTLR 4. In addition to using that release, you'll need to do the following two things to maximize performance:

  1. Use the two-stage parsing strategy. Assuming your start rule is called compilationUnit, it might look like the following:

    CompilationUnitContext compilationUnit;
    try {
      // Stage 1: High-speed parsing for correct documents
    
      parser.setErrorHandler(new BailErrorStrategy());
      parser.getInterpreter().setPredictionMode(PredictionMode.SLL);
      parser.getInterpreter().tail_call_preserves_sll = false;
      compilationUnit = parser.compilationUnit();
    } catch (ParseCancellationException e) {
      // Stage 2: High-accuracy fallback parsing for complex and/or erroneous documents
    
      // Rewind the input stream and clear parser state before re-parsing
      parser.reset();
      parser.setErrorHandler(new DefaultErrorStrategy());
      parser.getInterpreter().setPredictionMode(PredictionMode.LL);
      parser.getInterpreter().tail_call_preserves_sll = false;
      parser.getInterpreter().enable_global_context_dfa = true;
      compilationUnit = parser.compilationUnit();
    }
    
  2. Enable the global context DFA (I included this in the previous code block so you can't miss it):

    parser.getInterpreter().enable_global_context_dfa = true;
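
The control flow of the two-stage strategy can also be written as a small reusable helper. This is a dependency-free sketch, not ANTLR code: TwoStage, Outcome, and BailException are hypothetical names, with BailException standing in for ParseCancellationException, and the three callbacks standing in for the fast parse, the reset, and the fallback parse. It also records which stage produced the result, which makes it easy to count how many files need the fallback.

```java
import java.util.function.Supplier;

public class TwoStage {
    /** Hypothetical stand-in for ANTLR's ParseCancellationException. */
    static class BailException extends RuntimeException {}

    /** The parse result together with the stage (1 or 2) that produced it. */
    static final class Outcome<T> {
        final T value;
        final int stage;
        Outcome(T value, int stage) { this.value = value; this.stage = stage; }
    }

    /**
     * Tries the fast attempt first. If it bails out, runs the reset action
     * and then the slower, more accurate attempt.
     */
    static <T> Outcome<T> run(Supplier<T> fast, Runnable reset, Supplier<T> accurate) {
        try {
            return new Outcome<>(fast.get(), 1);
        } catch (BailException e) {
            reset.run();
            return new Outcome<>(accurate.get(), 2);
        }
    }

    public static void main(String[] args) {
        Outcome<String> direct = run(() -> "tree", () -> {}, () -> "tree (fallback)");
        Outcome<String> retried =
            run(() -> { throw new BailException(); }, () -> {}, () -> "tree (fallback)");
        System.out.println("stages used: " + direct.stage + ", " + retried.stage);
    }
}
```

In a real parser setup, the first supplier would configure the bail-out error handler and SLL mode before calling the start rule, and the second would switch to the default error handler and LL mode, as in the code above.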
    
Sam Harwell answered Sep 27 '22 22:09