
The Hostname Regex

Tags:

regex

I'm looking for the regex to validate hostnames. It must completely conform to the standard. Right now, I have

^[0-9a-z]([0-9a-z\-]{0,61}[0-9a-z])?(\.[0-9a-z]([0-9a-z\-]{0,61}[0-9a-z])?)*$

but it allows successive hyphens and hostnames longer than 255 characters. If the perfect regex is impossible, say so.

Edit/Clarification: a Google search didn't reveal that this is a solved (or proven unsolvable) problem. I want to create the definitive regex so that nobody has to write their own ever again. If dialects matter, I want a version for each one in which this can be done.

CannibalSmith asked Sep 13 '09 18:09




2 Answers

^(?=.{1,255}$)[0-9A-Za-z](?:(?:[0-9A-Za-z]|-){0,61}[0-9A-Za-z])?(?:\.[0-9A-Za-z](?:(?:[0-9A-Za-z]|-){0,61}[0-9A-Za-z])?)*\.?$
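
A quick way to see what this pattern accepts is to run it against a few edge cases. The following is only a sketch: the names `HOSTNAME_RE` and `isHostname` and the sample inputs are made up for illustration, and the pattern itself is copied unchanged from above.

```javascript
// The pattern from the answer above, copied verbatim into a regex literal.
// HOSTNAME_RE and isHostname are hypothetical names chosen for this sketch.
const HOSTNAME_RE = /^(?=.{1,255}$)[0-9A-Za-z](?:(?:[0-9A-Za-z]|-){0,61}[0-9A-Za-z])?(?:\.[0-9A-Za-z](?:(?:[0-9A-Za-z]|-){0,61}[0-9A-Za-z])?)*\.?$/;

function isHostname(candidate) {
  return HOSTNAME_RE.test(candidate);
}

// The lookahead caps the whole string at 255 characters; each label is
// capped at 63 by the {0,61} quantifier plus its first and last character.
const longest = Array(4).fill("a".repeat(63)).join("."); // 255 characters

console.log(isHostname("example.com"));   // true
console.log(isHostname("example.com."));  // true (trailing dot is allowed)
console.log(isHostname("-example.com"));  // false (leading hyphen)
console.log(isHostname("a".repeat(64)));  // false (label longer than 63)
console.log(isHostname(longest));         // true
console.log(isHostname(longest + ".b"));  // false (257 characters)
```

Note that the trailing dot counts toward the 255-character limit here, because the lookahead measures the whole string.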

CannibalSmith answered Sep 25 '22 20:09


The accepted answer accepts invalid hostnames containing consecutive dots (example..com). Here is a regex I came up with that I think exactly matches what the RFC allows (minus the trailing "." that some resolvers support to short-circuit relative naming and force FQDN resolution).

Spec:

<hname> ::= <name>*["."<name>]
<name> ::= <letter-or-digit>[*[<letter-or-digit-or-hyphen>]<letter-or-digit>]

Regex:

^([a-zA-Z0-9](?:(?:[a-zA-Z0-9-]*|(?<!-)\.(?![-.]))*[a-zA-Z0-9]+)?)$ 

I've tested quite a few permutations myself, and I think it is accurate.

This regex also does no length validation. The RFC requires length limits both on the labels between dots and on the full name, but these are easy to check as second and third passes after validating against this regex: check the full string length, then split on "." and check the length of every substring. E.g., in JavaScript, label length validation might look like: "example.com".split(".").reduce(function (prev, curr) { return prev && curr.length <= 63; }, true).
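
Put together, the three passes could be sketched roughly as follows. The function and constant names are hypothetical, the 63-character label limit is the one used above, and the 255-character limit for the full name is assumed from the question. The pattern relies on a lookbehind, so it needs a JavaScript engine that supports one.

```javascript
// Syntax pattern copied verbatim from this answer.
// SYNTAX_RE, MAX_NAME_LENGTH, MAX_LABEL_LENGTH and isValidHostname are
// hypothetical names used only for this sketch.
const SYNTAX_RE = /^([a-zA-Z0-9](?:(?:[a-zA-Z0-9-]*|(?<!-)\.(?![-.]))*[a-zA-Z0-9]+)?)$/;

const MAX_NAME_LENGTH = 255; // assumed limit for the full name
const MAX_LABEL_LENGTH = 63; // limit for each dot-separated label

function isValidHostname(candidate) {
  // Pass 1: allowed characters and placement of dots and hyphens.
  if (!SYNTAX_RE.test(candidate)) return false;
  // Pass 2: length of the full string.
  if (candidate.length > MAX_NAME_LENGTH) return false;
  // Pass 3: length of every label between dots.
  return candidate
    .split(".")
    .every((label) => label.length <= MAX_LABEL_LENGTH);
}

console.log(isValidHostname("example.com"));                  // true
console.log(isValidHostname("example..com"));                 // false
console.log(isValidHostname("a".repeat(64) + ".com"));        // false
```

Using `every` instead of `reduce` is just a shorter way to express the same label check; it stops at the first label that is too long.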


Alternative Regex (without negative lookbehind, courtesy of the HTML Living Standard):

^[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$ 
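
As far as I can tell, this alternative bounds each label at 63 characters inside the pattern itself but still does not limit the total length, so a separate check on the full string is still needed. A small sketch to illustrate, where `ALT_RE` and the sample inputs are names and values made up for this example:

```javascript
// Alternative pattern copied verbatim from above; ALT_RE is a hypothetical name.
const ALT_RE = /^[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/;

const label63 = "a".repeat(63);
const tooLong = Array(5).fill(label63).join("."); // 319 characters in total

console.log(ALT_RE.test("example.com"));   // true
console.log(ALT_RE.test("example..com"));  // false (empty label)
console.log(ALT_RE.test(label63 + "a"));   // false (64-character label)
console.log(ALT_RE.test(tooLong));         // true: total length is not checked
console.log(ALT_RE.test(tooLong) && tooLong.length <= 255); // false with the extra check
```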

derekm answered Sep 24 '22 20:09