Derive minimal regular expression from input

Question

I have a remote "agent" that returns "yes" or "no" when handed a string. Communicating with this agent is expensive, so I'm hoping to find a library that will allow me to iteratively build a regular expression given positive and negative feedback, while being intelligent about its construction. This would allow me to cache answers on the sending side.

For example, suppose we query the agent with "good" and receive a "yes". The initial derived regular expression ought to be "good".

Suppose I query then with "goop" and receive a "yes". I would expect the derived regular expression to be "goo[dp]", not "good|goop".

And so forth.

I do not need backtracking or any other fancy non-linear time operations in my derived regex. Presumably the generated regex would be a DFA under the hood. Is anyone aware of any c/c++ regular expression libraries capable of doing this? Alternatively, reasons why this is a dumb idea and better solutions to my real problem would also be useful.

Hakanai · Accepted Answer

Rather than a regular expression, you could use a Trie.

Then for each new string you walk the trie one node for each character. I suspect that you would also want a marker character for end of string - once you reach this character, if the node exists, it holds the yes/no answer.

Derive minimal regular expression from input

Tags:

c++

c

regex

dfa

tgoodhart

1 Answers

Hakanai

Recent Activity

Donate For Us

Derive minimal regular expression from input

Tags:

c++

c

regex

dfa

tgoodhart

1 Answers

Hakanai

Related questions

Recent Activity

Donate For Us