
Regex for matching any URL character

Tags:

regex

I came across a specification that described a field as:

Any URL char

I want to validate it on my side with a regex.

I searched a bit and found this great SO question, which contains every piece of information I needed. Still, it seemed a shame that no question asks precisely for the regex, so here I am.

What would be a proper regex matching any URL character?

Edit

I extracted the following regex from my understanding of the specification:

[\w\-.~:/?#\[\]@!$&'()*+,;=%]

So, is this regex correct and exhaustive, or did I miss anything?

After reading the specification, I guess it is simply "all ASCII characters".

Jeremy Grand asked Oct 29 '22 08:10


1 Answer

See the Characters section of RFC 3986:

A URI is composed from a limited set of characters consisting of digits, letters, and a few graphic symbols. A reserved subset of those characters may be used to delimit syntax components within a URI while the remaining characters, including both the unreserved set and those reserved characters not acting as delimiters, define each component's identifying data.

Although this indicates that only digits, letters, and some symbols are supported, Appendix B, "Parsing a URI Reference with a Regular Expression", suggests a regex for parsing a URI that may actually match pretty much every char:

The following line is the regular expression for breaking-down a well-formed URI reference into its components.

 ^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?
   12            3  4          5       6  7        8 9
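
To see how permissive that expression is, here is a minimal Python sketch that applies it and reads out the groups the RFC assigns to scheme, authority, path, query, and fragment (groups 2, 4, 5, 7, and 9). The names `URI_RE` and `split_uri` are made up for this illustration; only the pattern itself comes from the RFC.

```python
import re

# Pattern copied verbatim from RFC 3986, Appendix B.
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")


def split_uri(uri):
    """Split a URI reference into the five components named in the RFC.

    Every group in the pattern is optional, so the match always succeeds;
    components that are absent come back as None.
    """
    m = URI_RE.match(uri)
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }


# The example URI used in Appendix B itself.
parts = split_uri("http://www.ics.uci.edu/pub/ietf/uri/#Related")
print(parts)

# Characters outside the URI character set still end up in the path,
# which shows that this pattern splits a URI rather than validating it.
print(split_uri("a b<>"))
```

Note that the expression is meant for breaking down an already well-formed URI reference, so it should not be relied on to reject invalid characters.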

The [\w.~:/?#\[\]@!$&'()*+,;=%-] pattern you put together is too restrictive unless \w is Unicode-aware (a URI may contain any Unicode letters); if it is, the pattern might work more or less for you.
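
Whether \w is Unicode-aware depends on the regex engine and its flags. As a rough sketch, assuming Python's `re` module (where \w is Unicode-aware for `str` patterns by default and `re.ASCII` restricts it to `[a-zA-Z0-9_]`), the difference looks like this. The names `CHARS`, `unicode_aware`, and `ascii_only` are only illustrative.

```python
import re

# The character class from the question, with the hyphen moved to the end.
CHARS = r"[\w.~:/?#\[\]@!$&'()*+,;=%-]"

# In Python 3, \w in a str pattern matches Unicode word characters by default.
unicode_aware = re.compile(CHARS + "+")

# re.ASCII limits \w to [a-zA-Z0-9_].
ascii_only = re.compile(CHARS + "+", re.ASCII)

url = "https://example.com/caf\u00e9"  # ends with a precomposed e-acute

# fullmatch() requires the whole string to consist of allowed characters.
print(bool(unicode_aware.fullmatch(url)))  # Unicode letters are accepted
print(bool(ascii_only.fullmatch(url)))     # the accented letter is rejected

# A space is not in the class, so both variants reject it.
print(bool(unicode_aware.fullmatch("https://example.com/a b")))
```

Other engines behave differently (in some, \w is ASCII-only unless a Unicode flag is set), so check the documentation of the one you use.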

If you plan to match just ASCII URLs, use ^[\x00-\x7F]+$ (one or more ASCII characters of any kind) or ^[!-~]+$ (visible ASCII only).
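
A quick way to compare the two ASCII patterns, again sketched in Python as an assumption about your environment (the names `ANY_ASCII` and `VISIBLE_ASCII` are made up here). `fullmatch()` is used because in Python `$` also matches just before a trailing newline, which `search()` or `match()` would let through for the visible-only pattern.

```python
import re

# Any ASCII character, including controls and space (U+0000 to U+007F).
ANY_ASCII = re.compile(r"^[\x00-\x7F]+$")

# Visible ASCII only: '!' (U+0021) through '~' (U+007E), so no space.
VISIBLE_ASCII = re.compile(r"^[!-~]+$")

plain = "http://example.com/?q=1#top"
with_space = "http://example.com/a b"
non_ascii = "http://example.com/caf\u00e9"

for candidate in (plain, with_space, non_ascii):
    print(
        repr(candidate),
        bool(ANY_ASCII.fullmatch(candidate)),
        bool(VISIBLE_ASCII.fullmatch(candidate)),
    )
```

The visible-only pattern is the stricter of the two: it rejects the URL containing a space, while both patterns reject the one containing a non-ASCII letter.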

Wiktor Stribiżew answered Nov 15 '22 07:11