
Are there any regular expression engines that provide visibility into what they're doing? [closed]

Tags:

regex

In every programming language I've worked with, regular expression support (if it exists) is basically a black box: there are functions like match, scan, etc. that take an expression and return something (often a string or an array), but they don't report what they're doing while they're doing it.

I'm wondering if, in any reasonably popular programming language, there is either built-in or library support for matching regular expressions and providing some kind of real-time output (e.g., to standard out) indicating what's happening.

Update: I appreciate the comments so far; however, I'm not asking about a tool that displays the structure of the regular expression itself, which is what debuggex.com and regexper.com appear to do (though that's very cool!). I meant to ask about providing info during the part where the expression is applied to some input.

Here's a hypothetical example: suppose I had the expression "(foo|bar|baz)" and I test this against the string "baz"; then I'm picturing output that might look like...

testing "foo" - nope
testing "bar" - nope
testing "baz" - found match

Obviously it wouldn't look quite like that, but you get the idea.
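To make the hypothetical output above concrete, here is a minimal Python sketch that produces exactly those three lines. It does not hook into any real regex engine: it unrolls the alternation by hand and tries each literal branch with re.fullmatch. The function name trace_alternatives and the log format are assumptions made purely for illustration.

```python
import re


def trace_alternatives(alternatives, text):
    """Try each literal alternative against all of `text`, logging every attempt.

    Hypothetical helper: this imitates the desired trace from outside the
    engine; it does not report what the engine itself does internally.
    """
    log = []
    for alt in alternatives:
        # re.escape makes each branch a plain literal, as in (foo|bar|baz).
        if re.fullmatch(re.escape(alt), text):
            log.append(f'testing "{alt}" - found match')
            return alt, log
        log.append(f'testing "{alt}" - nope')
    return None, log


matched, log = trace_alternatives(["foo", "bar", "baz"], "baz")
print("\n".join(log))
```

A real engine may try the branches in a different order, or not one at a time at all, so this only shows the kind of report being asked for.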

asked Jun 14 '13 20:06 by Dan Tao



2 Answers

Several regular expression libraries are written in a way that lets you get state-by-state processing information. In particular, Russ Cox wrote an article on regular expressions that includes bits of code and an API for transitioning state by state:

http://swtch.com/~rsc/regexp/regexp1.html

The code used in the article was expanded into a complete, simple regex library that appears to give step-by-step output similar to what you described:

https://code.google.com/p/re1/

Later, the code was more fully worked out and is now a full-blown regex library maintained (and used internally) by Google:

https://code.google.com/p/re2/

EDIT

If you compile re2 with DebugDFA set to true in the source code, you will get state-by-state output during processing. However, for many regexes the output may not correspond one-to-one with the actual regular expression, and it is a little esoteric.
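As a rough illustration of what state-by-state processing means, the sketch below simulates a tiny automaton for the question's (foo|bar|baz) example in lockstep and records the set of live states after each input character. This is not the re1 or re2 API, and it is not their debug output; the function simulate and the (branch, offset) state encoding are assumptions chosen to keep the example short.

```python
def simulate(alternatives, text):
    """Lockstep simulation over literal alternatives (hypothetical sketch).

    A state (i, j) means "j characters of alternatives[i] matched so far".
    Every live state is advanced on each input character, and the set of
    live states is recorded after each step.
    """
    live = {(i, 0) for i in range(len(alternatives))}
    trace = [sorted(live)]
    for ch in text:
        live = {
            (i, j + 1)
            for (i, j) in live
            if j < len(alternatives[i]) and alternatives[i][j] == ch
        }
        trace.append(sorted(live))
    # The whole input matches if some branch has been consumed completely.
    matched = any(j == len(alternatives[i]) for (i, j) in live)
    return matched, trace


matched, trace = simulate(["foo", "bar", "baz"], "baz")
for step, states in enumerate(trace):
    print(step, states)
```

Note that the live set after "b" and "ba" holds both the bar and baz branches at once. That is one reason a state-level trace does not map one-to-one onto the pieces of the original expression.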

answered Sep 23 '22 20:09 by Michael Graczyk


Python's regular expression engine does provide some visibility, using the re.DEBUG flag, which prints debug information about the compiled expression. You're asking for something different, though (real-time feedback during matching), which I'm pretty sure does not exist. I could see it being integrated into an IDE or an enhanced Python shell such as IPython. It would be a fun thing to write and quite useful, in my opinion.
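For reference, the flag lives in Python's re module as re.DEBUG and is passed at compile time. A minimal sketch follows; the helper name dump_debug_info is made up for this example, and the exact text printed varies between Python versions, so none is shown here.

```python
import contextlib
import io
import re


def dump_debug_info(pattern):
    """Return whatever re.DEBUG prints to stdout while compiling `pattern`."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        re.compile(pattern, re.DEBUG)
    return buf.getvalue()


# Describes the parsed pattern (branches, literals, groups), not a match run.
print(dump_debug_info(r"(foo|bar|baz)"))
```

The output is produced once, when the pattern is compiled, and says nothing about any particular input string, which is why it does not answer the question as asked.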

answered Sep 24 '22 20:09 by foobarbecue