
Does Nginx unquote/unescape URL characters before matching?

Tags:

regex

nginx

When I want to match a URL with spaces in it, the spaces may be encoded as %20 or +. In order to match this in an Nginx regex, what pattern do I need to use?

Does Nginx pass the URL through as-is?

(?:%20|\+| )

or, does Nginx do some unquoting or unescaping first?

(?:\+| )

or is + normalized?

David Eyk asked Nov 13 '22 14:11



1 Answer

Though I didn't find any references in the Nginx documentation with a quick look, from my testing, Nginx percent-decodes the URI before matching, so '%20' becomes a literal space and is matched by '\s'. A '+' is not percent-encoded, so it is left as a literal '+' in the path and has to be matched explicitly with '\+'.

E.g. /route/the%20test

should match (?:\s)

However, I tend to lean on the safer side and use something like: (?:\s|%20)
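
A minimal sketch of a location block along these lines, assuming the decoding behaviour described above. The /route/the test path and the response body are placeholders made up for illustration, and the %20 alternative is only the belt-and-braces variant:

```nginx
# Regex locations are matched against the decoded URI, so a request for
# /route/the%20test is seen here as "/route/the test".
# \s   matches the decoded space
# %20  kept only as a "safer side" fallback
# \+   matches a literal plus, which is not turned into a space in the path
location ~ ^/route/the(?:\s|%20|\+)test$ {
    return 200 "matched";
}
```

Requesting /route/the%20test and /route/the+test with curl and checking for the "matched" body is a quick way to confirm how a given Nginx build behaves.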

yekta answered Dec 01 '22 21:12
