ANTLR: call a rule from a different grammar

Tags: grammar, antlr, modularity, rule

Is it possible to invoke a rule from a different grammar?
The purpose is to have two languages in the same file, where the second language is introduced by (begin ...) and ... is written in the second language. The first grammar should invoke another grammar to parse that second language.

For example:


grammar A;

start_rule
    :    '(' 'begin' B.program ')' //or something like that
    ;


grammar B;

program
    :   something* EOF
    ;

something
    : ...
    ;

322

asked Jul 11 '11 14:07

zhujik

1 Answer

Your question could be interpreted in (at least) two ways:

1. separate rules from a large grammar into separate grammars;
2. parse a separate language inside your "main" language (island grammar).

I assume it's the first, in which case you can import grammars.

A demo for option 1:

file: L.g

lexer grammar L;

Digit
  :  '0'..'9'
  ;

file: Sub.g

parser grammar Sub;

number
  :  Digit+
  ;

file: Root.g

grammar Root;

import Sub;

parse
  :  number EOF {System.out.println("Parsed: " + $number.text);}
  ;

file: Main.java

import org.antlr.runtime.*;

public class Main {
  public static void main(String[] args) throws Exception {
    L lexer = new L(new ANTLRStringStream("42"));
    CommonTokenStream tokens = new CommonTokenStream(lexer);
    RootParser parser = new RootParser(tokens);
    parser.parse();
  }
}

Run the demo:

bart@hades:~/Programming/ANTLR/Demos/Composite$ java -cp antlr-3.3.jar org.antlr.Tool L.g
bart@hades:~/Programming/ANTLR/Demos/Composite$ java -cp antlr-3.3.jar org.antlr.Tool Root.g 
bart@hades:~/Programming/ANTLR/Demos/Composite$ javac -cp antlr-3.3.jar *.java
bart@hades:~/Programming/ANTLR/Demos/Composite$ java -cp .:antlr-3.3.jar Main

which will print:

Parsed: 42

to the console.

For more information, see: http://www.antlr.org/wiki/display/ANTLR3/Composite+Grammars

A demo for option 2:

A nice example of a language inside a language is regex. You have the "normal" regex language with its meta characters, but there's another one in it: the language that describes a character set (or character class).

Instead of accounting for the meta characters of a character set (range -, negation ^, etc.) inside your regex grammar, you can treat a character set as a single token: a [ followed by everything up to and including the closing ] (possibly with \] in it!). When you then come across a CharSet token in one of your parser rules, you invoke the CharSet parser.
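Before looking at the ANTLR grammars, the idea can be sketched in plain Java. This is only a hand-written illustration, not ANTLR-generated code: the class name IslandSketch and the methods outerTokens, parseCharSet and parse are hypothetical names chosen for this sketch. The outer scanner keeps each [...] as one opaque chunk, and only the inner routine knows what ^ means inside it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, hand-written illustration of the island-grammar idea (no ANTLR involved).
class IslandSketch {

  /** Outer "lexer": splits a regex into chunks, keeping each [...] as a single chunk. */
  static List<String> outerTokens(String regex) {
    List<String> tokens = new ArrayList<>();
    int i = 0;
    while (i < regex.length()) {
      char c = regex.charAt(i);
      if (c == '[') {
        int end = i + 1;
        // Scan to the closing ']', skipping escaped characters such as \]
        while (end < regex.length() && regex.charAt(end) != ']') {
          end += (regex.charAt(end) == '\\') ? 2 : 1;
        }
        if (end >= regex.length()) {
          throw new IllegalArgumentException("unterminated character set");
        }
        tokens.add(regex.substring(i, end + 1));
        i = end + 1;
      } else if (c == '\\' && i + 1 < regex.length()) {
        tokens.add(regex.substring(i, i + 2)); // escape sequence, e.g. \d
        i += 2;
      } else {
        tokens.add(String.valueOf(c));
        i++;
      }
    }
    return tokens;
  }

  /** Inner "parser": the only place that understands character-set syntax. */
  static String parseCharSet(String charSet) {
    String body = charSet.substring(1, charSet.length() - 1); // strip [ and ]
    boolean negated = body.startsWith("^");
    if (negated) {
      body = body.substring(1);
    }
    return (negated ? "NEGATED" : "NORMAL") + "(" + body + ")";
  }

  /** Outer "parser": hands every character-set chunk to the inner parser. */
  static List<String> parse(String regex) {
    List<String> result = new ArrayList<>();
    for (String token : outerTokens(regex)) {
      result.add(token.startsWith("[") ? parseCharSet(token) : token);
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(parse("a[^\\da-f]+"));
  }
}
```

The ANTLR demo below has the same shape: the CharSet lexer rule plays the role of outerTokens, and the static CharSetParser.ast method plays the role of parseCharSet.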

file: Regex.g

grammar Regex;

options { 
  output=AST;
}

tokens {
  REGEX;
  ATOM;
  CHARSET;
  INT;
  GROUP;
  CONTENTS;
}

@members {
  public static CommonTree ast(String source) throws RecognitionException {
    RegexLexer lexer = new RegexLexer(new ANTLRStringStream(source));
    RegexParser parser = new RegexParser(new CommonTokenStream(lexer));
    return (CommonTree)parser.parse().getTree();
  }
}

parse
  :  atom+ EOF -> ^(REGEX atom+)
  ;

atom
  :  group quantifier?     -> ^(ATOM group quantifier?)
  |  EscapeSeq quantifier? -> ^(ATOM EscapeSeq quantifier?)
  |  Other quantifier?     -> ^(ATOM Other quantifier?)
  |  CharSet quantifier?   -> ^(CHARSET {CharSetParser.ast($CharSet.text)} quantifier?)
  ;

group
  :  '(' atom+ ')' -> ^(GROUP atom+)
  ;

quantifier
  :  '+'
  |  '*'
  ;

CharSet
  :  '[' (('\\' .) | ~('\\' | ']'))+ ']'
  ;

EscapeSeq
  :  '\\' .
  ;

Other
  :  ~('\\' | '(' | ')' | '[' | ']' | '+' | '*')
  ;

file: CharSet.g

grammar CharSet;

options { 
  output=AST;
}

tokens {
  NORMAL_CHAR_SET;
  NEGATED_CHAR_SET;
  RANGE;
}

@members {
  public static CommonTree ast(String source) throws RecognitionException {
    CharSetLexer lexer = new CharSetLexer(new ANTLRStringStream(source));
    CharSetParser parser = new CharSetParser(new CommonTokenStream(lexer));
    return (CommonTree)parser.parse().getTree();
  }
}

parse
  :  OSqBr ( normal  -> ^(NORMAL_CHAR_SET normal)
           | negated -> ^(NEGATED_CHAR_SET negated)
           ) 
     CSqBr
  ;

normal
  :  (EscapeSeq | Hyphen | Other) atom* Hyphen?
  ;

negated
  :  Caret normal -> normal
  ;

atom
  :  EscapeSeq
  |  Caret
  |  Other
  |  range
  ;

range
  :  from=Other Hyphen to=Other -> ^(RANGE $from $to)
  ;

OSqBr
  :  '['
  ;

CSqBr
  :  ']'
  ;

EscapeSeq
  :  '\\' .
  ;

Caret
  :  '^'
  ;

Hyphen
  :  '-'
  ;

Other
  :  ~('-' | '\\' | '[' | ']')
  ;

file: Main.java

import org.antlr.runtime.*;
import org.antlr.runtime.tree.*;
import org.antlr.stringtemplate.*;

public class Main {
  public static void main(String[] args) throws Exception {
    CommonTree tree = RegexParser.ast("((xyz)*[^\\da-f])foo");
    DOTTreeGenerator gen = new DOTTreeGenerator();
    StringTemplate st = gen.toDOT(tree);
    System.out.println(st);
  }
}

If you run the main class, it prints the DOT output for the regex ((xyz)*[^\da-f])foo (written as "((xyz)*[^\\da-f])foo" in the Java string literal), which describes the following tree:

[image: rendering of the resulting AST]

The magic is inside the Regex.g grammar in the atom rule where I inserted a tree node in a rewrite rule by invoking the static ast method from the CharSetParser class:

CharSet ... -> ^(... {CharSetParser.ast($CharSet.text)} ...)

Note that inside such rewrite rules, there must not be a semicolon! So, this would be wrong: {CharSetParser.ast($CharSet.text);}.

EDIT

And here's how to create tree walkers for both grammars:

file: RegexWalker.g

tree grammar RegexWalker;

options {
  tokenVocab=Regex;
  ASTLabelType=CommonTree;
}

walk
  :  ^(REGEX atom+) {System.out.println("REGEX: " + $start.toStringTree());}
  ;

atom
  :  ^(ATOM group quantifier?)
  |  ^(ATOM EscapeSeq quantifier?)
  |  ^(ATOM Other quantifier?)
  |  ^(CHARSET t=. quantifier?) {CharSetWalker.walk($t);}
  ;

group
  :  ^(GROUP atom+)
  ;

quantifier
  :  '+'
  |  '*'
  ;

file: CharSetWalker.g

tree grammar CharSetWalker;

options {
  tokenVocab=CharSet;
  ASTLabelType=CommonTree;
}

@members {
  public static void walk(CommonTree tree) {
    try {
      CommonTreeNodeStream nodes = new CommonTreeNodeStream(tree);
      CharSetWalker walker = new CharSetWalker(nodes);
      walker.walk();
    } catch(Exception e) {
      e.printStackTrace();
    }
  }
}

walk
  :  ^(NORMAL_CHAR_SET normal)  {System.out.println("NORMAL_CHAR_SET: " + $start.toStringTree());}
  |  ^(NEGATED_CHAR_SET normal) {System.out.println("NEGATED_CHAR_SET: " + $start.toStringTree());}
  ;

normal
  :  (EscapeSeq | Hyphen | Other) atom* Hyphen?
  ;

atom
  :  EscapeSeq
  |  Caret
  |  Other
  |  range
  ;

range
  :  ^(RANGE Other Other)
  ;

file: Main.java

import org.antlr.runtime.*;
import org.antlr.runtime.tree.*;
import org.antlr.stringtemplate.*;

public class Main {
  public static void main(String[] args) throws Exception {
    CommonTree tree = RegexParser.ast("((xyz)*[^\\da-f])foo");
    CommonTreeNodeStream nodes = new CommonTreeNodeStream(tree);
    RegexWalker walker = new RegexWalker(nodes);
    walker.walk();
  }
}

To run the demo, do:

java -cp antlr-3.3.jar org.antlr.Tool CharSet.g 
java -cp antlr-3.3.jar org.antlr.Tool Regex.g
java -cp antlr-3.3.jar org.antlr.Tool CharSetWalker.g
java -cp antlr-3.3.jar org.antlr.Tool RegexWalker.g 
javac -cp antlr-3.3.jar *.java
java -cp .:antlr-3.3.jar Main

which will print:

NEGATED_CHAR_SET: (NEGATED_CHAR_SET \d (RANGE a f))
REGEX: (REGEX (ATOM (GROUP (ATOM (GROUP (ATOM x) (ATOM y) (ATOM z)) *) (CHARSET (NEGATED_CHAR_SET \d (RANGE a f))))) (ATOM f) (ATOM o) (ATOM o))
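The NEGATED_CHAR_SET line is printed before the REGEX line because the action in RegexWalker's walk rule sits after the ^(REGEX atom+) pattern, so it only runs once all atoms have been matched, including the one that calls CharSetWalker.walk. The plain-Java sketch below mimics that ordering. It is an illustration only: Node is a hypothetical stand-in for ANTLR's CommonTree, the method names are made up for this sketch, and quantifier children of CHARSET are ignored for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of one tree walker delegating a subtree to another (no ANTLR involved).
class WalkerDelegationSketch {

  /** Minimal stand-in for an AST node; not ANTLR's CommonTree. */
  static final class Node {
    final String text;
    final List<Node> children;

    Node(String text, Node... children) {
      this.text = text;
      this.children = Arrays.asList(children);
    }

    /** LISP-style rendering, similar in spirit to what the demo prints. */
    String toStringTree() {
      if (children.isEmpty()) {
        return text;
      }
      StringBuilder sb = new StringBuilder("(").append(text);
      for (Node child : children) {
        sb.append(' ').append(child.toStringTree());
      }
      return sb.append(')').toString();
    }
  }

  /** Inner walker: only knows about character-set trees. */
  static void walkCharSet(Node root, List<String> log) {
    log.add(root.text + ": " + root.toStringTree());
  }

  /** Visits the regex tree; a CHARSET node hands its first child to the inner walker. */
  static void visit(Node node, List<String> log) {
    if (node.text.equals("CHARSET")) {
      walkCharSet(node.children.get(0), log);
      return;
    }
    for (Node child : node.children) {
      visit(child, log);
    }
  }

  /** Outer walker: its own message is logged only after the whole tree was visited. */
  static List<String> walkRegex(Node root) {
    List<String> log = new ArrayList<>();
    visit(root, log);
    log.add("REGEX: " + root.toStringTree());
    return log;
  }

  /** Small sample tree: an atom followed by a negated character set. */
  static Node sampleTree() {
    Node range = new Node("RANGE", new Node("a"), new Node("f"));
    Node charSet = new Node("NEGATED_CHAR_SET", new Node("\\d"), range);
    return new Node("REGEX",
        new Node("ATOM", new Node("x")),
        new Node("CHARSET", charSet));
  }

  public static void main(String[] args) {
    for (String line : walkRegex(sampleTree())) {
      System.out.println(line);
    }
  }
}
```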

144

answered Sep 27 '22 23:09

Bart Kiers
