ANTLR V4 + Java8 Grammar - OutOfMemoryException

Question

I'm trying to use ANTLR V4 with the publicly available Java 8 grammar: https://github.com/antlr/grammars-v4/blob/master/java8/Java8.g4

I generated the parser classes from it and tried to parse the Java 8 JRE sources, but at java.text.SimpleDateFormat.java it crashes with:

java.lang.OutOfMemoryError: GC overhead limit exceeded

It also crashes when I try to parse that single file alone.

Can this be solved somehow? It looks as if ANTLR V4 can't handle files with more than about 2000 LOC. Is that a correct assumption?

What I've done so far:

- Increased the memory assigned to the JVM in multiple steps from 256MB up to 4GB. The error then changes to:

  java.lang.OutOfMemoryError: Java heap space

- Checked that there is no syntactical problem with the input file. First I removed the first half of the file, and parsing seemed okay. Then I undid that and removed the second half of the file instead, and parsing seemed okay again.
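
When stepping the heap size up like this, it can help to confirm that the larger limit actually reached the JVM that runs the parser. The following is a minimal sketch using only the standard `Runtime` API; the class name `HeapCheck` and its helper methods are hypothetical and not part of ANTLR or the grammar.

```java
class HeapCheck {
    /** Converts a byte count to whole mebibytes (rounding down). */
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    /** Heap currently in use: reserved heap minus the free part of it. */
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        // maxMemory() is the most memory the JVM will attempt to use;
        // it normally tracks the -Xmx setting.
        long max = Runtime.getRuntime().maxMemory();
        System.out.println("max heap:  " + toMiB(max) + " MiB");
        System.out.println("used heap: " + toMiB(usedBytes()) + " MiB");
    }
}
```

Running it with the same options as the parser, for example `java -Xmx4g HeapCheck`, should report a maximum close to the requested size. If it does not, the option is probably not being passed to the process that does the parsing.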

Sam Harwell · Accepted Answer

It looks like the grammar in that repository is based on one I wrote. The grammar relies on certain functionality which is only available in my "optimized" fork of ANTLR 4 in order to perform well. In addition to using that release, you'll need to do the following two things to maximize performance:

Use the two-stage parsing strategy. Assuming your start rule is called compilationUnit, it might look like the following:

```
CompilationUnitContext compilationUnit;
try {
  // Stage 1: High-speed parsing for correct documents

  parser.setErrorHandler(new BailErrorStrategy());
  parser.getInterpreter().setPredictionMode(PredictionMode.SLL);
  parser.getInterpreter().tail_call_preserves_sll = false;
  compilationUnit = parser.compilationUnit();
} catch (ParseCancellationException e) {
  // Stage 2: High-accuracy fallback parsing for complex and/or erroneous documents

  // TODO: reset your input stream
  parser.setErrorHandler(new DefaultErrorStrategy());
  parser.getInterpreter().setPredictionMode(PredictionMode.LL);
  parser.getInterpreter().tail_call_preserves_sll = false;
  parser.getInterpreter().enable_global_context_dfa = true;
  compilationUnit = parser.compilationUnit();
}
```

Enable the global context DFA (I included this in the previous code block so you can't miss it)
```
parser.getInterpreter().enable_global_context_dfa = true;
```
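
The control flow of the two-stage strategy, including the input reset that the code above leaves as a TODO, can be sketched without ANTLR at all. Everything in this example (`TwoStageSketch`, `RewindableInput`, `FastPathFailed`, the two toy stages) is a hypothetical stand-in rather than ANTLR API: the exception plays the role of `ParseCancellationException`, and `rewind()` plays the role of whatever resets your token stream and parser.

```java
import java.util.function.Function;

class TwoStageSketch {
    /** Hypothetical stand-in for the exception a bail-out error strategy throws. */
    static class FastPathFailed extends RuntimeException {}

    /** Hypothetical rewindable input, standing in for a token stream. */
    static class RewindableInput {
        private final String text;
        private int position = 0;

        RewindableInput(String text) { this.text = text; }

        boolean hasNext() { return position < text.length(); }
        char next() { return text.charAt(position++); }
        void rewind() { position = 0; }
    }

    /** Tries the fast stage first; on failure rewinds the input and runs the accurate stage. */
    static <T> T parseTwoStage(RewindableInput input,
                               Function<RewindableInput, T> fastStage,
                               Function<RewindableInput, T> accurateStage) {
        try {
            return fastStage.apply(input);
        } catch (FastPathFailed e) {
            // The fast stage may have consumed part of the input, so start over.
            input.rewind();
            return accurateStage.apply(input);
        }
    }

    /** Toy fast stage: counts characters but gives up on the first '?'. */
    static Integer fastCount(RewindableInput input) {
        int count = 0;
        while (input.hasNext()) {
            if (input.next() == '?') throw new FastPathFailed();
            count++;
        }
        return count;
    }

    /** Toy accurate stage: counts every character, including '?'. */
    static Integer fullCount(RewindableInput input) {
        int count = 0;
        while (input.hasNext()) {
            input.next();
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        Integer easy = parseTwoStage(new RewindableInput("abc"),
                TwoStageSketch::fastCount, TwoStageSketch::fullCount);
        Integer hard = parseTwoStage(new RewindableInput("ab?d"),
                TwoStageSketch::fastCount, TwoStageSketch::fullCount);
        System.out.println(easy + " " + hard); // prints "3 4"
    }
}
```

The second call only returns 4 because the input is rewound before the fallback runs; without the rewind, the accurate stage would see just the characters left after the point where the fast stage gave up. In the reference ANTLR 4 runtime, `Parser.reset()` is the usual candidate for that step, but check how it behaves in the fork you are using.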

Tags: java, parsing, java-8, antlr, antlr4

Ronald Duck
