Getting ANTLR to generate a script interpreter?

Say I have the following Java API that is all packaged up as blocks.jar:

public class Block {
    private String name;
    private int xCoord;
    private int yCoord;

    // Getters, setters, ctors, etc.

    public void setCoords(int x, int y) {
        setXCoord(x);
        setYCoord(y);
    }
}

public class BlockController {
    public static void moveBlock(Block block, int newXCoord, int newYCoord) {
        block.setCoords(newXCoord, newYCoord);
    }

    public static void stackBlocks(Block under, Block onTop) {
        // Stack "onTop" on top of "under".
        // Don't worry about the math here, this is just for an example.
        onTop.setCoords(under.getXCoord() + onTop.getXCoord(), under.getYCoord());
    }
}

Again, don't worry about the math and the fact that (x,y) coordinates don't accurately represent blocks in 3D space. The point is that we have Java code, compiled as a JAR, that performs operations on blocks. I now want to build a lightweight scripting language that allows a non-programmer to invoke the various block API methods and manipulate blocks, and I want to implement its interpreter with ANTLR (latest version is 4.3).

The scripting language, we'll call it BlockSpeak, might look like this:

block A at (0, 10)   # Create block "A" at coordinates (0, 10)
block B at (0, 20)   # Create block "B" at coordinates (0, 20)
stack A on B         # Stack block A on top of block B

This might be equivalent to the following Java code:

Block A, B;
A = new Block(0, 10);
B = new Block(0, 20);
BlockController.stackBlocks(B, A);

So the idea is that the ANTLR-generated interpreter would take a *.blockspeak script as input, and use the commands in this script to invoke blocks.jar API operations. I read the excellent Simple Example, which creates a simple calculator using ANTLR. However, in that example there is an ExpParser class with an eval() method:

ExpParser parser = new ExpParser(tokens);
parser.eval();

The problem here is that, in the case of the calculator, the tokens represent a mathematical expression to evaluate, and eval() returns the evaluation of the expression. In the case of an interpreter, the tokens would represent my BlockSpeak script, but calling eval() shouldn't evaluate anything; it should know how to map the various BlockSpeak commands to Java code:

BlockSpeak Command:             Java code:
==========================================
block A at (0, 10)      ==>     Block A = new Block(0, 10);
block B at (0, 20)      ==>     Block B = new Block(0, 20);
stack A on B            ==>     BlockController.stackBlocks(B, A);

So my question is, where do I perform this "mapping"? In other words, how do I instruct ANTLR to call various pieces of code (packaged inside blocks.jar) when it encounters particular grammar constructs in the BlockSpeak script? More importantly, can someone give me a pseudo-code example?

asked Jul 15 '14 18:07 by IAmYourFaja



1 Answer

I would simply evaluate the script on the fly, rather than generate Java source files that would then need to be compiled themselves.

With ANTLR 4 it is highly recommended to keep the grammar and target specific code separate from each other and put any target specific code inside a tree-listener or -visitor.

I will give a quick demo of how to use a listener.

A grammar for your example input could look like this:

File: blockspeak/BlockSpeak.g4

grammar BlockSpeak;

parse
 : instruction* EOF
 ;

instruction
 : create_block
 | stack_block
 ;

create_block
 : 'block' NAME 'at' position
 ;

stack_block
 : 'stack' top=NAME 'on' bottom=NAME
 ;

position
 : '(' x=INT ',' y=INT ')'
 ;

COMMENT
 : '#' ~[\r\n]* -> skip
 ;

INT
 : [0-9]+
 ;

NAME
 : [a-zA-Z]+
 ;

SPACES
 : [ \t\r\n] -> skip
 ;

Some supporting Java classes:

File: blockspeak/Main.java

package blockspeak;

import org.antlr.v4.runtime.ANTLRInputStream;
import org.antlr.v4.runtime.CommonTokenStream;
import org.antlr.v4.runtime.tree.ParseTreeWalker;

import java.util.Scanner;

public class Main {

    public static void main(String[] args) throws Exception {

        Scanner keyboard = new Scanner(System.in);

        // Some initial input to let the parser have a go at.
        String input = "block A at (0, 10)   # Create block \"A\" at coordinates (0, 10)\n" +
                "block B at (0, 20)   # Create block \"B\" at coordinates (0, 20)\n" +
                "stack A on B         # Stack block A on top of block B";

        EvalBlockSpeakListener listener = new EvalBlockSpeakListener();

        // Keep asking for input until the user enters 'q'.
        while(!input.equals("q")) {

            // Create a lexer and parser for `input`.
            BlockSpeakLexer lexer = new BlockSpeakLexer(new ANTLRInputStream(input));
            BlockSpeakParser parser = new BlockSpeakParser(new CommonTokenStream(lexer));

            // Now parse the `input` and attach our listener to it. We want to reuse
            // the same listener because it will hold our Blocks-map.
            ParseTreeWalker.DEFAULT.walk(listener, parser.parse());

            // Let's see if the user wants to continue.
            System.out.print("Type a command and press return (q to quit) $ ");
            input = keyboard.nextLine();
        }

        System.out.println("Bye!");
    }
}

// You can place this Block class inside Main.java as well.
class Block {

    final String name;
    int x;
    int y;

    Block(String name, int x, int y) {
        this.name = name;
        this.x = x;
        this.y = y;
    }

    void onTopOf(Block that) {
        // TODO
    }
}
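
The onTopOf method above is left as a TODO. One way it might be filled in is sketched below, assuming the placeholder math from the question's BlockController.stackBlocks (add the x coordinates, copy the y coordinate). The StackDemo wrapper class and its main method are hypothetical and exist only to make the sketch runnable on its own:

```java
// Hypothetical wrapper so the sketch compiles and runs without ANTLR.
class StackDemo {

    static class Block {
        final String name;
        int x;
        int y;

        Block(String name, int x, int y) {
            this.name = name;
            this.x = x;
            this.y = y;
        }

        // Assumption: reuse the placeholder math from the question's
        // stackBlocks(under, onTop); it is not real stacking geometry.
        void onTopOf(Block that) {
            this.x = that.x + this.x;
            this.y = that.y;
        }
    }

    public static void main(String[] args) {
        Block a = new Block("A", 0, 10);
        Block b = new Block("B", 0, 20);
        a.onTopOf(b);
        // prints: A is now at (0, 20)
        System.out.printf("%s is now at (%d, %d)%n", a.name, a.x, a.y);
    }
}
```

If the real blocks.jar is on the classpath, the listener could instead hold instances of the question's Block class and call BlockController.stackBlocks(bottom, top) directly rather than reimplementing the math.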

The Main class is pretty self-explanatory thanks to the inline comments. The tricky part is what the listener is supposed to look like. Well, here it is:

File: blockspeak/EvalBlockSpeakListener.java

package blockspeak;

import org.antlr.v4.runtime.misc.NotNull;

import java.util.HashMap;
import java.util.Map;

/**
 * A class extending the `BlockSpeakBaseListener` (which will be generated
 * by ANTLR) in which we override the methods in which to create blocks, and
 * in which to stack blocks.
 */
public class EvalBlockSpeakListener extends BlockSpeakBaseListener {

    // A map that keeps track of our Blocks.
    private final Map<String, Block> blocks = new HashMap<String, Block>();

    @Override
    public void enterCreate_block(@NotNull BlockSpeakParser.Create_blockContext ctx) {

        String name = ctx.NAME().getText();
        Integer x = Integer.valueOf(ctx.position().x.getText());
        Integer y = Integer.valueOf(ctx.position().y.getText());

        Block block = new Block(name, x, y);

        System.out.printf("creating block: %s\n", name);

        blocks.put(block.name, block);
    }

    @Override
    public void enterStack_block(@NotNull BlockSpeakParser.Stack_blockContext ctx) {

        Block bottom = this.blocks.get(ctx.bottom.getText());
        Block top = this.blocks.get(ctx.top.getText());

        if (bottom == null) {
            System.out.printf("no such block: %s\n", ctx.bottom.getText());
        }
        else if (top == null) {
            System.out.printf("no such block: %s\n", ctx.top.getText());
        }
        else {
            System.out.printf("putting %s on top of %s\n", top.name, bottom.name);
            top.onTopOf(bottom);
        }
    }
}

The listener above has 2 methods defined that map to the following parser rules:

create_block
 : 'block' NAME 'at' position
 ;

stack_block
 : 'stack' top=NAME 'on' bottom=NAME
 ;

Whenever the tree walker "enters" a node created by one of these parser rules, the corresponding method inside the listener is called. So, whenever enterCreate_block is called (the walker enters a create_block node), we create (and save) a block, and when enterStack_block is called, we retrieve the 2 blocks involved in the operation and stack one on top of the other.
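
To make the callback order concrete, here is a toy sketch of a depth-first walk that fires enter and exit events, which is the general pattern a tree walker follows. It does not use ANTLR: Node, Listener, walk and trace are hypothetical stand-ins for the parse tree, the generated listener and ParseTreeWalker, and the tree is built by hand to mirror the example script's structure (one create_block instruction followed by one stack_block instruction):

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration only; none of these types are part of ANTLR.
class WalkDemo {

    // Hypothetical stand-in for a parse-tree node: a rule name plus children.
    static class Node {
        final String rule;
        final List<Node> children = new ArrayList<>();

        Node(String rule, Node... kids) {
            this.rule = rule;
            for (Node kid : kids) {
                children.add(kid);
            }
        }
    }

    // Hypothetical stand-in for a generated listener interface.
    interface Listener {
        void enter(Node node);
        void exit(Node node);
    }

    // Depth-first walk: enter fires before the children, exit after them.
    static void walk(Listener listener, Node node) {
        listener.enter(node);
        for (Node child : node.children) {
            walk(listener, child);
        }
        listener.exit(node);
    }

    // Walks a hand-built tree and records every event in order.
    static List<String> trace() {
        Node tree = new Node("parse",
                new Node("instruction", new Node("create_block", new Node("position"))),
                new Node("instruction", new Node("stack_block")));

        List<String> events = new ArrayList<>();
        walk(new Listener() {
            @Override
            public void enter(Node node) { events.add("enter " + node.rule); }

            @Override
            public void exit(Node node) { events.add("exit " + node.rule); }
        }, tree);
        return events;
    }

    public static void main(String[] args) {
        for (String event : trace()) {
            System.out.println(event);
        }
    }
}
```

Because the walk visits instructions in document order, the create_block callback for a block runs before any later stack_block callback that refers to it, which is why the listener can look blocks up in its map.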

To see the 3 classes above in action, download the ANTLR 4.4 complete JAR (antlr-4.4-complete.jar) into the directory that holds the blockspeak/ directory with the .g4 and .java files.

Open a console and perform the following 3 steps:

1. Generate the ANTLR files:

java -cp antlr-4.4-complete.jar org.antlr.v4.Tool blockspeak/BlockSpeak.g4 -package blockspeak

2. Compile all Java source files:

javac -cp ./antlr-4.4-complete.jar blockspeak/*.java

3. Run the main class:

3.1. Linux/Mac
java -cp .:antlr-4.4-complete.jar blockspeak.Main
3.2. Windows
java -cp .;antlr-4.4-complete.jar blockspeak.Main

Here is an example session of running the Main class:

bart@hades:~/Temp/demo$ java -cp .:antlr-4.4-complete.jar blockspeak.Main
creating block: A
creating block: B
putting A on top of B
Type a command and press return (q to quit) $ block X at (0,0)
creating block: X
Type a command and press return (q to quit) $ stack Y on X
no such block: Y
Type a command and press return (q to quit) $ stack A on X 
putting A on top of X
Type a command and press return (q to quit) $ q
Bye!
bart@hades:~/Temp/demo$ 

More info on tree listeners: https://theantlrguy.atlassian.net/wiki/display/ANTLR4/Parse+Tree+Listeners

answered Sep 29 '22 11:09 by Bart Kiers