
Building a "Semi-Natural Language" DSL in Ruby

I'm interested in building a DSL in Ruby for parsing microblog updates. Specifically, I thought I could translate the text into a chain of Ruby method calls, in the same spirit as Rails (via ActiveSupport) allowing "4.days.ago". I already have regex code that will translate the text

@USER_A: give X points to @USER_B for accomplishing some task
@USER_B: take Y points from @USER_A for not giving me enough points

into something like

Scorekeeper.new.give(x).to("USER_B").for("accomplishing some task").giver("USER_A")
Scorekeeper.new.take(y).from("USER_A").for("not giving me enough points").giver("USER_B")

It's acceptable to me to formalize the syntax of the updates so that only standardized text is provided and parsed, allowing me to smartly process updates. Thus, it seems it's more a question of how to implement the DSL class. I have the following stub class (removed all error checking and replaced some with comments to minimize paste):

class Scorekeeper

  attr_accessor :score, :user, :reason, :sender

  def give(num)
    # Can 'give 4' or can 'give a -5'; ensure 'to' called
    self.score = num
    self
  end

  def take(num)
    # ensure negative and 'from' called
    self.score = num < 0 ? num : num * -1
    self
  end

  def plus
    self.score > 0
  end

  def to(str)
    self.user = str
    self
  end

  def from(str)
    self.user = str
    self
  end

  def for(str)
    self.reason = str
    self
  end

  def giver(str)
    self.sender = str
    self
  end

  def command
    str = plus ? "giving @#{user} #{score} points" : "taking #{score * -1} points from @#{user}"
    "@#{sender} is #{str} for #{reason}"
  end

end

Running the following commands:

t = eval('Scorekeeper.new.take(4).from("USER_A").for("not giving me enough points").giver("USER_B")')
p t.command
p t.inspect

Yields the expected results:

"@USER_B is taking 4 points from @USER_A for not giving me enough points"
"#<Scorekeeper:0x100152010 @reason=\"not giving me enough points\", @user=\"USER_A\", @score=-4, @sender=\"USER_B\">"

So my question is mainly, am I doing anything to shoot myself in the foot by building upon this implementation? Does anyone have any examples for improvement in the DSL class itself or any warnings for me?

BTW, to get the eval string, I'm mostly using sub/gsub and regex. I figured that's the easiest way, but I could be wrong.

JohnMetta asked Feb 08 '10 19:02


2 Answers

Am I understanding you correctly: you want to take a string from a user and cause it to trigger some behavior?

Based on the two examples you listed, you probably can get by with using regular expressions.

For example, to parse this example:

@USER_A: give X points to @USER_B for accomplishing some task

With Ruby:

input = "@abe: give 2 points to @bob for writing clean code"
PATTERN = /^@(.+?): give ([0-9]+) points to @(.+?) for (.+?)$/
input =~ PATTERN
user_a = $~[1] # => "abe"
x      = $~[2] # => "2"
user_b = $~[3] # => "bob"
why    = $~[4] # => "writing clean code"

But if there is more complexity, at some point you might find it easier and more maintainable to use a real parser. If you want a parser that works well with Ruby, I recommend Treetop: http://treetop.rubyforge.org/

The idea of taking a string and converting it to code to be evaled makes me nervous. Using eval is a big risk and should be avoided if possible. There are other ways to accomplish your goal. I'll be happy to give some ideas if you want.
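As one illustration of an eval-free route, here is a minimal sketch. It assumes the formalized syntax from the question. The `UPDATE` pattern, the `PREPOSITION` table, and `parse_update` are names invented for this example, and the `Scorekeeper` below is a trimmed stand-in for the question's class, not a replacement for it.

```ruby
# Trimmed stand-in for the question's class (error checking omitted).
class Scorekeeper
  attr_accessor :score, :user, :reason, :sender

  def give(num)
    self.score = num
    self
  end

  def take(num)
    self.score = -(num.abs) # always store a negative score for 'take'
    self
  end

  def to(str)
    self.user = str
    self
  end
  alias from to

  def for(str)
    self.reason = str
    self
  end

  def giver(str)
    self.sender = str
    self
  end
end

# Assumed syntax: "@sender: give|take N points to|from @user for reason"
UPDATE = /\A@(\w+): (give|take) (\d+) points (to|from) @(\w+) for (.+)\z/

# Each verb is only valid with one preposition.
PREPOSITION = { "give" => "to", "take" => "from" }.freeze

# Returns a populated Scorekeeper, or nil if the text does not fit the syntax.
def parse_update(text)
  m = UPDATE.match(text)
  return nil unless m

  sender, verb, points, prep, user, reason = m.captures
  return nil unless PREPOSITION[verb] == prep

  # The pattern only admits give/take and to/from, so public_send can
  # never reach a method name chosen freely by the user.
  Scorekeeper.new
             .public_send(verb, points.to_i)
             .public_send(prep, user)
             .for(reason)
             .giver(sender)
end

keeper = parse_update("@abe: give 2 points to @bob for writing clean code")
p keeper.score  # the points as an Integer
p keeper.reason # the free-text reason
```

Because the captured strings are passed as ordinary arguments, nothing the user types is ever executed as Ruby source.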

A question about the DSL you suggest: are you going to use it natively in another part of your application? Or do you just plan on using it as part of the process to convert the string into the behavior you want? I'm not sure what is best without knowing more, but you may not need the DSL if you are just parsing the strings.

David J. answered Sep 27 '22 17:09


This echoes some of my thoughts on a tangential project (an old-style text MOO).

I'm not convinced that a compiler-style parser is going to be the best way for the program to deal with the vagaries of English text. My current thoughts have me splitting up the understanding of English into separate objects -- so a box understands "open box" but not "press button", etc. -- and then having the objects use some sort of DSL to call centralised code that actually makes things happen.

I'm not sure that you've got to the point where you understand how the DSL is actually going to help you. Maybe you need to look at how the English text gets turned into DSL, first. I'm not saying that you don't need a DSL; you might very well be right.

As for hints as to how to do that? Well, I think if I were you I would be looking for specific verbs. Each verb would "know" what sort of thing it should expect from the text around it. So in your example "to" and "from" would expect a user immediately following.
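To make the "each verb knows what to expect" idea concrete, here is a rough sketch, not a finished design. The `EXPECTS` and `MATCHERS` tables and the `scan_keywords` method are hypothetical names, and the rule that everything after the first " for " is free-text reason is an assumption of this example.

```ruby
# Each keyword declares what kind of token must immediately follow it.
EXPECTS = {
  "give" => :number,
  "take" => :number,
  "to"   => :user,
  "from" => :user
}.freeze

# How each kind of token is recognised.
MATCHERS = {
  number: /\A\d+\z/,
  user:   /\A@\w+\z/
}.freeze

# Returns a hash of keyword => following token, plus the reason under "for".
# Raises ArgumentError when a keyword is followed by the wrong kind of token.
def scan_keywords(text)
  # Split off the reason first so words inside it are not treated as keywords.
  command, _sep, reason = text.partition(" for ")
  found = {}

  command.split.each_cons(2) do |word, following|
    kind = EXPECTS[word]
    next unless kind

    unless MATCHERS[kind].match?(following)
      raise ArgumentError, "expected #{kind} after '#{word}', got '#{following}'"
    end
    found[word] = following
  end

  found["for"] = reason unless reason.empty?
  found
end

p scan_keywords("give 4 points to @USER_B for wanting to help")
```

A keyword that appears as the very last word of the command has nothing following it and is silently skipped here; a fuller version would probably want to report that as an error too.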

This isn't especially divergent from the code you've posted here, IMO.

You might get some mileage out of looking at the answers to my question. One commenter pointed me to the Interpreter Pattern, which I found especially enlightening: there's a nice Ruby example here.
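For a flavour of what the Interpreter Pattern could look like for the points example, here is a small sketch of my own (it is not the example linked above). The `Give`, `Take`, and `Sequence` classes and the use of a plain hash of scores as the context are all assumptions made for illustration.

```ruby
# Each expression object knows how to apply itself to a shared context,
# which in this sketch is just a hash of user name => running score.
class Give
  def initialize(points, user)
    @points = points
    @user = user
  end

  def interpret(scores)
    scores[@user] += @points
    scores
  end
end

class Take
  def initialize(points, user)
    @points = points
    @user = user
  end

  def interpret(scores)
    scores[@user] -= @points
    scores
  end
end

# A composite expression: interprets its children in order.
class Sequence
  def initialize(*expressions)
    @expressions = expressions
  end

  def interpret(scores)
    @expressions.each { |expression| expression.interpret(scores) }
    scores
  end
end

scores = Hash.new(0) # unknown users start at zero
program = Sequence.new(Give.new(4, "USER_B"), Take.new(1, "USER_B"))
p program.interpret(scores) # => {"USER_B"=>3}
```

The parsing step would then only need to build these objects; running them is a separate concern.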

Shadowfirebird answered Sep 27 '22 17:09