
What would be the best tool to create a natural DSL in Java? [closed]

A couple of days ago, I read a blog entry (http://ayende.com/Blog/archive/2008/09/08/Implementing-generic-natural-language-DSL.aspx) in which the author discusses the idea of a generic natural language DSL parser using .NET.

The brilliant part of his idea, in my opinion, is that the text is parsed and matched against classes using the same name as the sentences.

Take, as an example, the following lines:

Create user user1 with email [email protected] and password test
Log user1 in
Take user1 to category t-shirts
Make user1 add item Flower T-Shirt to cart
Take user1 to checkout

These lines would be converted using a collection of "known" objects that receive the result of the parsing. An example object would be (using Java for my example):

public class CreateUser {
    private final String user;
    private String email;
    private String password;

    // Matches "Create user <user>"
    public CreateUser(String user) {
        this.user = user;
    }

    // Matches "with email <email>"
    public void withEmail(String email) {
        this.email = email;
    }

    // Matches "and password <password>"
    public void andPassword(String password) {
        this.password = password;
    }
}

So, when processing the first sentence, the CreateUser class would be a match (because its name is the concatenation of "create user") and, since its constructor takes a parameter, the parser would take "user1" as the user parameter.

After that, the parser would identify that the next part, "with email" also matches a method name, and since that method takes a parameter, it would parse "[email protected]" as being the email parameter.
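To make the matching idea concrete, here is a minimal sketch using only reflection, with no lexer or parser generator. It is an illustration under several assumptions, not the blog author's implementation: the names SentenceBinder, register and bind are hypothetical, each command class is assumed to have a single public constructor taking only String parameters, every value is assumed to be a single whitespace-delimited word, and the email address in main is a placeholder.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: binds a sentence to a registered command class by name.
class SentenceBinder {

    // Same shape as the CreateUser example; toString added only for inspection.
    public static class CreateUser {
        private final String user;
        private String email;
        private String password;

        public CreateUser(String user) { this.user = user; }
        public void withEmail(String email) { this.email = email; }
        public void andPassword(String password) { this.password = password; }

        @Override
        public String toString() {
            return "CreateUser[" + user + ", " + email + ", " + password + "]";
        }
    }

    // Lower-cased simple class name -> command class.
    private final Map<String, Class<?>> commands = new HashMap<>();

    public void register(Class<?> type) {
        commands.put(type.getSimpleName().toLowerCase(), type);
    }

    public Object bind(String sentence) {
        String[] words = sentence.trim().split("\\s+");
        int pos = 0;

        // Concatenate leading words until they spell a registered class name.
        Class<?> type = null;
        StringBuilder name = new StringBuilder();
        while (pos < words.length && type == null) {
            name.append(words[pos++].toLowerCase());
            type = commands.get(name.toString());
        }
        if (type == null) {
            throw new IllegalArgumentException("No command matches: " + sentence);
        }

        try {
            // Assumption: one public constructor, String parameters only.
            Constructor<?> ctor = type.getConstructors()[0];
            int n = ctor.getParameterCount();
            if (pos + n > words.length) {
                throw new IllegalArgumentException("Missing arguments in: " + sentence);
            }
            Object[] args = Arrays.copyOfRange(words, pos, pos + n, Object[].class);
            Object command = ctor.newInstance(args);
            pos += n;

            // Index the public single-String-parameter methods by lower-cased name.
            Map<String, Method> methods = new HashMap<>();
            for (Method m : type.getMethods()) {
                if (m.getParameterCount() == 1 && m.getParameterTypes()[0] == String.class) {
                    methods.put(m.getName().toLowerCase(), m);
                }
            }

            // Repeatedly match "<method words> <value>" pairs.
            while (pos < words.length) {
                StringBuilder key = new StringBuilder();
                Method method = null;
                while (pos < words.length && method == null) {
                    key.append(words[pos++].toLowerCase());
                    method = methods.get(key.toString());
                }
                if (method == null || pos >= words.length) {
                    throw new IllegalArgumentException("Cannot parse near: " + key);
                }
                method.invoke(command, words[pos++]);
            }
            return command;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not bind: " + sentence, e);
        }
    }

    public static void main(String[] args) {
        SentenceBinder binder = new SentenceBinder();
        binder.register(CreateUser.class);
        // Placeholder email address, chosen for this example only.
        System.out.println(binder.bind(
            "Create user user1 with email user1@example.com and password test"));
    }
}
```

A sketch like this breaks down as soon as a value spans several words (for example "Flower T-Shirt"), which is one place where a real grammar starts to pay off.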

I think you get the idea by now, right? One quite clear application of this, at least for me, would be to allow application testers to create "testing scripts" in natural language and then parse the sentences into classes that use JUnit to check the application's behavior.

I'd like to hear ideas, tips and opinions on tools or resources that could be used to code such a parser in Java. Better yet if we could avoid complex lexers or frameworks like ANTLR, which I think might be like using a hammer to kill a fly.

More than that, if anyone is up to start an open source project for that, I would definitely be interested.

kolrie asked Sep 27 '08 19:09



1 Answer

Considering the complexity of lexing and parsing, I don't know if I'd want to code all that by hand. ANTLR isn't that hard to pick up, and I think it is worth looking into given your problem. If you use a parser grammar to build an abstract syntax tree (AST) from the input, it's pretty easy to then process that AST with a tree grammar. The tree grammar could easily handle executing the process you described.

You'll find ANTLR in many places, including Eclipse, Groovy, and Grails for a start. The Definitive ANTLR Reference makes it fairly straightforward to get up to speed on the basics quickly.

I had a project earlier this year that had to handle some user-generated query text. I started down the path of processing it manually, but it quickly became overwhelming. I took a couple of days to get up to speed on ANTLR and had an initial version of my grammar and processor running in a few days. Subsequent changes and adjustments to the requirements would have killed any custom version, but they required relatively little effort once I had the ANTLR grammars up and running.

Good luck!

Joe Skora answered Oct 20 '22 19:10
