Most efficient regular expression for Nginx location

Tags: regex, nginx, webserver

What is the most efficient way to define a location directive which matches something like

location = /[0-9a-zA-Z_-]{1,6} { content_by_lua_file ....}

In other words, a URI that matches a string of 1 to 6 characters made up of "-", "_", digits and letters.

Or is it faster to check the string length within my Lua code, which will generate the output, by using a location directive like

location  / {content_by_lua_file...}

asked Oct 23 '13 20:10

user1606908

2 Answers

Regular expressions are very efficient at what they do.

When the task is trivial (for instance checking the presence of a particular string), a string function can be faster than a regex—depending on the platform. Here, you are checking both for a character range and a length. It's unlikely that Lua code (compiled at run time) will be faster than the pre-compiled C code of the PCRE regex library used by nginx.

In general, the regex for a string from 1 to 6 characters with "-", "_", digits and letters can be written as

^[-\w]{1,6}$

That is because

The ^ anchor asserts that we are at the beginning of the string
The \w word character matches letters, digits and the underscore character
The character class [-\w] adds the hyphen, and the {1,6} quantifier repeats it one to six times
The $ anchor asserts that we are at the end of the string

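However, in nginx the ~ modifier only marks a case-sensitive regular expression match; it does not anchor the pattern to the start of the URI, so the ^ anchor must stay. The pattern is tested against the request URI, which always begins with a slash, and a pattern containing braces has to be quoted. You would write something like this:

location ~ "^/[-\w]{1,6}$" {
    # some rewrite code, for example
    # rewrite ^([^/]+)/?$ /oldsite/$1 break;
}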

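One more morsel of information for the curious: in Lua itself, the above regex cannot be carried over unchanged. Lua patterns use % in place of \ to form character classes, but they have no {1,6} quantifier, and %w matches only letters and digits, so the underscore must be listed explicitly. The pattern below checks the characters, and the length has to be tested separately (for example with #s <= 6):

^[%w_%-]+$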
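
As a rough sketch, the whole check could look like this in plain Lua, with the length tested on its own and "_" and "-" listed explicitly in the set. The function name is_short_token is made up for this example, and in a real handler you would presumably strip the leading "/" from the URI before calling it.

```lua
-- Hypothetical helper (the name is not from the question): returns true when
-- s is 1 to 6 characters long and contains only letters, digits, "_" or "-".
local function is_short_token(s)
  if type(s) ~= "string" then
    return false
  end
  local len = #s
  if len < 1 or len > 6 then
    return false
  end
  -- %w is alphanumeric only, so "_" and the escaped "%-" are listed explicitly.
  return s:find("^[%w_%-]+$") ~= nil
end

print(is_short_token("ab_-9"))     --> true
print(is_short_token("toolong7"))  --> false (8 characters)
print(is_short_token("a/b"))       --> false ("/" is not allowed)
```

Whether this beats the regex location is something only a benchmark on your own setup can tell; the sketch just shows what the Lua side would have to do.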

Reference

ngx_http_rewrite_module
Lua Patterns

answered Oct 19 '22 10:10

zx81

I think that in Lua you will have to check not only the length but also the content of the string.
Nginx uses the C library PCRE for regular expressions.
There is also PCRE-JIT, which JIT-compiles regular expressions and is particularly useful when the expression is more complex than the one in your question. I think doing the match in nginx is faster.

answered Oct 19 '22 10:10

Rhim
