I've seen a few comments here that mention that modern regular expressions go beyond what can be represented in a regular language. How is this so? What features of modern regular expressions are not regular? Examples would be helpful.

Aren't modern regular expression dialects regular?

1 Answer

The first thing that comes to mind is backreferences:

(\w*)\s\1

This matches a group of word characters, followed by a whitespace character and then the same group previously matched. For example, hello hello matches, while hello world doesn't (when the pattern has to match the whole string).

This construct is not regular (i.e. it can't be generated by a regular grammar).
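A minimal sketch of the backreference behaviour, using Python's standard re module. The helper name is_repeated is made up for illustration. It uses fullmatch on purpose: because \w* can match the empty string, an unanchored search on hello world would still succeed by matching just the space with an empty group.

```python
import re

# The pattern from the answer: a captured run of word characters,
# one whitespace character, then the same text again via \1.
PATTERN = re.compile(r"(\w*)\s\1")


def is_repeated(text: str) -> bool:
    """Return True if the whole string has the form <word><space><same word>."""
    return PATTERN.fullmatch(text) is not None


print(is_repeated("hello hello"))  # True
print(is_repeated("hello world"))  # False
```

The second half of the match depends on whatever the first half captured, which is exactly the kind of memory a finite automaton does not have.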

Another non-regular feature, supported by Perl Compatible Regular Expressions (PCRE), is recursive patterns:

\((a*|(?R))*\)

This can be used to match any combination of balanced parentheses and "a"s (example from Wikipedia).
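Python's standard re module does not support (?R), so the sketch below is not PCRE. It is a hand-written recursive recognizer for the same language, meant only to show why recursion is needed. The function names match_group and is_balanced are made up for illustration.

```python
def match_group(s: str, i: int = 0) -> int:
    """Try to read one parenthesised group starting at s[i].

    Mirrors \\((a*|(?R))*\\): an opening parenthesis, then any mix of
    'a' characters and nested groups, then a closing parenthesis.
    Returns the index just past the group, or -1 if there is no match.
    """
    if i >= len(s) or s[i] != "(":
        return -1
    i += 1
    while i < len(s):
        if s[i] == "a":
            i += 1
        elif s[i] == "(":
            # Recursive step, playing the role of (?R).
            i = match_group(s, i)
            if i == -1:
                return -1
        elif s[i] == ")":
            return i + 1
        else:
            return -1
    return -1  # input ended before the group was closed


def is_balanced(s: str) -> bool:
    """Return True if the whole string is exactly one balanced group."""
    return match_group(s) == len(s)


print(is_balanced("(a(aa)())"))  # True
print(is_balanced("(()"))        # False
```

The nesting depth is unbounded, so the recognizer needs a stack (here, the call stack). That puts the language outside what a regular expression in the formal sense can describe.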

115

answered Nov 16 '22 00:11

NullUserException


Tags:

regex

regular-language

David Johnstone
