Distance between regular expressions

Question

Can we compute some kind of distance between regular expressions?

The idea is to measure how similar two regular expressions are.

Gumbo · Accepted Answer

You can build a deterministic finite-state machine for each regular expression and compare their transitions. The difference between the two sets of transitions can then be used to measure the distance between the regular expressions.
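One way to turn this idea into a number is to count the strings on which the two automata disagree. The sketch below assumes the DFAs have already been built (by hand or by some library) and are given as plain Python tuples. The representation, the function name `symmetric_difference_count`, the example automata `A_STAR` and `A_PLUS`, and the length bound `max_len` are all illustrative assumptions, not something the answer specifies. Note that it compares the accepted languages rather than the raw transition tables, which is only one possible reading of "difference of transitions".

```python
from collections import Counter

# Assumed representation: a DFA is (start_state, accepting_states, transitions),
# where transitions maps (state, symbol) -> state. A missing entry leads to an
# implicit dead state, represented here by None.
A_STAR = (0, {0}, {(0, "a"): 0})               # hand-built DFA for a*
A_PLUS = (0, {1}, {(0, "a"): 1, (1, "a"): 1})  # hand-built DFA for a+


def symmetric_difference_count(dfa_a, dfa_b, alphabet, max_len):
    """Count strings of length <= max_len accepted by exactly one of the DFAs."""
    start_a, accept_a, delta_a = dfa_a
    start_b, accept_b, delta_b = dfa_b
    # current[(sa, sb)] = number of strings of the current length that drive
    # the first DFA to state sa and the second DFA to state sb.
    current = Counter({(start_a, start_b): 1})
    differing = 0
    for length in range(max_len + 1):
        for (sa, sb), count in current.items():
            if (sa in accept_a) != (sb in accept_b):
                differing += count
        if length == max_len:
            break
        # Advance both automata in lockstep by one symbol (product construction).
        following = Counter()
        for (sa, sb), count in current.items():
            for symbol in alphabet:
                ta = delta_a.get((sa, symbol))
                tb = delta_b.get((sb, symbol))
                following[(ta, tb)] += count
        current = following
    return differing
```

Calling `symmetric_difference_count(A_STAR, A_PLUS, "ab", 3)` counts the strings over `{a, b}` of length at most 3 that only one of the two automata accepts. A count of zero for every length bound would mean the two expressions are equivalent.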

David Kanarek · Answer

There are a few metrics you could use:

The length of a valid match. Some regexes match strings of a fixed length, some have an upper limit and some a lower limit. Compare how similar their lengths or possible lengths are.
The characters that match. Any regex will have a set of characters a match can contain (maybe all characters). Compare the set of included characters.
Use a large document and see how many matches each regex makes and how many of those are identical.
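The last metric in the list can be sketched with Python's `re` module. This is a minimal illustration, assuming that two matches count as "identical" when they cover the same span of the document, and that the overlap is summarised as a Jaccard distance. Both choices, and the names `match_spans` and `jaccard_distance`, are assumptions made for the example.

```python
import re


def match_spans(pattern, text):
    """Return the set of (start, end) spans that pattern matches in text."""
    return {m.span() for m in re.finditer(pattern, text)}


def jaccard_distance(pattern_a, pattern_b, text):
    """0.0 when both patterns match exactly the same spans, 1.0 when none are shared."""
    spans_a = match_spans(pattern_a, text)
    spans_b = match_spans(pattern_b, text)
    union = spans_a | spans_b
    if not union:
        # Neither pattern matched anything; treat them as indistinguishable here.
        return 0.0
    return 1.0 - len(spans_a & spans_b) / len(union)
```

For example, `jaccard_distance(r"c[ao]t", r"c[ou]t", "cat cot cut")` compares the two patterns on a three-word document. The result depends heavily on the document chosen, so a large and representative sample matters.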

Are you looking for strict equivalence?

CraigTP · Answer

I suppose you could compute the Levenshtein distance between the regular expression strings themselves. That's certainly one way of measuring a "distance" between two different regular expressions.

Of course, it's possible that regular expressions are not required here at all, and that computing the Levenshtein distance between the actual "value" strings that the regular expressions would otherwise be applied to may yield a better result.
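For reference, here is a minimal sketch of the Levenshtein distance using the standard dynamic-programming recurrence. The function name `levenshtein` is chosen for illustration. The same function works on pattern strings and on the value strings they would be applied to.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions and substitutions."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]
```

Keep in mind that this measures textual similarity only: `levenshtein("a|b", "[ab]")` is nonzero even though both patterns match exactly the same strings.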

Tags: regex, comparison, formal-languages

Nicolas NOEL

