So here's what I'm looking to achieve. I would like to give my users a single google-like textbox where they can type their queries. And I would like them to be able to express semi-natural language such as
"view all between 1/1/2008 and 1/2/2008"
it's ok if the syntax has to be fairly structured and limited to this specific domain ... these are expert users who will be using this.
Ultimately, I think I'd like the parse results to be available as some sort of expression tree. But if you've got some other ideas about what data structure might be better.
This is in C# :-)
You are describing a programming language. Granted it's a small language (often called a little language, or Domain Specific Language (DSL)). If you've never heard the term recursive descent parser, you are probably better off following Paul's advice and using drop down boxes of some description.
However, again, I would have to agree with him, that if you want to do it, Antlr is the way to go. There are tutorials on the site that might help you get started. Basically, you will need to describe how the syntax using Backus-Naur Form notation.
You will then run Antlr over your grammer, and it will generate your parser. You can then feed the input from your textbook into an Abstract Syntax Tree. You can then use that Tree to generate your query. It's not as difficult as it all sounds, but there's a bit to it.
If you're really into this and/or want to stretch your programming wings a bit, you could read more on the topic with the Dragon Book, AKA Compilers: Principles, Techniques and Tools.
Good luck my friend.
For a very simple language, I'd go with regexps. Main benefit there is you don't have to deal with any code generation. Debugging of the pattern matching is basically nil, though.
If your language is moderately complex (you wouldn't mind specifying the entire thing in a single grammar file), I'd go with Coco/R -- it's fast, easy to use, and makes extremely debuggable code.
For a more complex language, my current favorite is Antlr v3. Supports multi-file grammars (via the 'import' statement), which is very nice. The generated code is debuggable, but takes a bit of getting used to before debugging could be considered 'easy.'
I have been in this situation before. After much discussion, we decided that context-sensitive dropdowns were a better solution than just a textbox.
For instance, have 4 dropdowns, but the last 3 are disabled. Then, when the user selects an option from the first, it populates and enables the others. So they would select "view all" then "between", and then maybe pop a textbox or a calendar for the last two.
Here's an example that's kind of like what I am talking about
An expression tree is a good idea. There are many good general parsers and parser generators out there, open-source as well as commercial, which can transform correct query strings into expression trees.
Use Oslo, it's designed specifically for this...
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With