Fully qualified domain name validation

Tags:

regex

bash

fqdn

Is there a quick and dirty way to validate if the correct FQDN has been entered? Keep in mind there is no DNS server or Internet connection, so validation has to be done via regex/awk/sed.

Any ideas?

Riaan asked Aug 04 '12 15:08


1 Answer

(?=^.{4,253}$)(^((?!-)[a-zA-Z0-9-]{1,63}(?<!-)\.)+[a-zA-Z]{2,63}$) 

Regex is always going to be an approximation at best for things like this, and the rules change over time. The regex above was written with the following in mind and is specific to hostnames:

Hostnames are composed of a series of labels concatenated with dots. Each label is 1 to 63 characters long, and may contain:

  • the ASCII letters a-z (in a case insensitive manner),
  • the digits 0-9,
  • and the hyphen ('-').

Additionally:

  • labels cannot start or end with hyphens (RFC 952)
  • labels can start with numbers (RFC 1123)
  • max length of an ASCII hostname including dots is 253 characters (not counting the trailing dot) (http://blogs.msdn.com/b/oldnewthing/archive/2012/04/12/10292868.aspx)
  • underscores are not allowed in hostnames (but are allowed in other DNS types)
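
The rules above can also be checked without any regex. The following Python sketch is one way to do that; `is_valid_hostname` is a hypothetical helper name, and it deliberately covers only the label rules listed so far (it makes no TLD assumptions and treats a trailing dot as invalid).

```python
import string

# Characters permitted inside a label: ASCII letters, digits, hyphen.
ALLOWED = set(string.ascii_letters + string.digits + "-")


def is_valid_hostname(name: str) -> bool:
    """Check the hostname rules listed above (hypothetical helper)."""
    # Overall length limit, dots included.
    if not 1 <= len(name) <= 253:
        return False
    for label in name.split("."):
        # Each label is 1 to 63 characters long; this also rejects
        # empty labels such as the one in "a..com".
        if not 1 <= len(label) <= 63:
            return False
        # Only letters, digits and hyphens; underscores are rejected here.
        if not set(label) <= ALLOWED:
            return False
        # Labels cannot start or end with a hyphen.
        if label.startswith("-") or label.endswith("-"):
            return False
    return True


for candidate in ["911.gov", "a-.com", "-a.com", "my_host.com"]:
    print(candidate, is_valid_hostname(candidate))
```

Splitting on dots and checking each label keeps every rule on its own line, which may be easier to adjust than a single pattern when the rules change.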

some assumptions:

  • TLD is at least 2 characters and only a-z
  • we want at least 1 level above TLD

results: valid / invalid

  • 911.gov - valid
  • 911 - invalid (no TLD)
  • a-.com - invalid
  • -a.com - invalid
  • a.com - valid
  • a.66 - invalid
  • my_host.com - invalid (underscore)
  • typical-hostname33.whatever.co.uk - valid
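
To reproduce the results listed above, here is a small Python sketch that runs the regex from the top of this answer against the same sample names. Python's `re` module is only one possible PCRE-like engine, and the `is_fqdn` name and `EXPECTED` table are hypothetical helpers, not part of the original answer.

```python
import re

# The regex from the top of this answer, unchanged.
FQDN_RE = re.compile(
    r"(?=^.{4,253}$)(^((?!-)[a-zA-Z0-9-]{1,63}(?<!-)\.)+[a-zA-Z]{2,63}$)"
)


def is_fqdn(name: str) -> bool:
    """Return True if name matches the hostname regex (hypothetical helper)."""
    # fullmatch also rejects a trailing newline, which `$` alone would allow.
    return FQDN_RE.fullmatch(name) is not None


# Expected verdicts copied from the valid / invalid list above.
EXPECTED = {
    "911.gov": True,
    "911": False,
    "a-.com": False,
    "-a.com": False,
    "a.com": True,
    "a.66": False,
    "my_host.com": False,
    "typical-hostname33.whatever.co.uk": True,
}

for name, expected in EXPECTED.items():
    actual = is_fqdn(name)
    status = "ok" if actual == expected else "MISMATCH"
    print(f"{name}: {'valid' if actual else 'invalid'} ({status})")
```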

EDIT: John Rix provided an alternative hack of the regex to make the specification of a TLD optional:

(?=^.{1,253}$)(^(((?!-)[a-zA-Z0-9-]{1,63}(?<!-))|((?!-)[a-zA-Z0-9-]{1,63}(?<!-)\.)+[a-zA-Z]{2,63})$) 

  • 911 - valid
  • 911.gov - valid

EDIT 2: Someone asked for a version that works in JS. The original did not work there because JS did not support regex lookbehind at the time (lookbehind was added in ES2018, so older engines still reject it). The problem is specifically the (?<!-) construct, which specifies that the previous character cannot be a hyphen.

anyway, here it is rewritten without the lookbehind - a little uglier but not much

(?=^.{4,253}$)(^((?!-)[a-zA-Z0-9-]{0,62}[a-zA-Z0-9]\.)+[a-zA-Z]{2,63}$) 

you could likewise make a similar replacement on John Rix's version.
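
As a sketch of that replacement, the Python snippet below applies the same substitution to John Rix's version and checks that both forms agree on a few sample names. The `WITHOUT_LB` pattern is my own derivation under that assumption, not something taken from John Rix, and the function names are hypothetical.

```python
import re

# John Rix's optional-TLD version, as quoted above (uses lookbehind).
WITH_LB = re.compile(
    r"(?=^.{1,253}$)(^(((?!-)[a-zA-Z0-9-]{1,63}(?<!-))"
    r"|((?!-)[a-zA-Z0-9-]{1,63}(?<!-)\.)+[a-zA-Z]{2,63})$)"
)

# Assumed rewrite: each "{1,63}(?<!-)" becomes "{0,62}[a-zA-Z0-9]",
# so the last character of a label must be a letter or digit.
WITHOUT_LB = re.compile(
    r"(?=^.{1,253}$)(^(((?!-)[a-zA-Z0-9-]{0,62}[a-zA-Z0-9])"
    r"|((?!-)[a-zA-Z0-9-]{0,62}[a-zA-Z0-9]\.)+[a-zA-Z]{2,63})$)"
)


def matches_with_lookbehind(name: str) -> bool:
    return WITH_LB.fullmatch(name) is not None


def matches_without_lookbehind(name: str) -> bool:
    return WITHOUT_LB.fullmatch(name) is not None


SAMPLES = ["911", "911.gov", "a-", "-a", "a-.com", "-a.com",
           "a.com", "a.66", "my_host.com", "a-b.co.uk"]

for name in SAMPLES:
    # Both columns should show the same verdict for every sample.
    print(name, matches_with_lookbehind(name), matches_without_lookbehind(name))
```

Agreement on a handful of samples is not a proof of equivalence, but both label forms describe 1 to 63 allowed characters with no hyphen at either end.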

EDIT 3: if you want to allow trailing dots - which is technically allowed:

(?=^.{4,253}\.?$)(^((?!-)[a-zA-Z0-9-]{1,63}(?<!-)\.)+[a-zA-Z]{2,63}\.?$) 
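
A short Python sketch of how the trailing-dot variant behaves follows. The helper names are hypothetical, and `strip_root_dot` is just one assumed way to normalize names before handing them to tools that treat the trailing dot inconsistently.

```python
import re

# The trailing-dot variant from EDIT 3, unchanged.
TRAILING_DOT_RE = re.compile(
    r"(?=^.{4,253}\.?$)(^((?!-)[a-zA-Z0-9-]{1,63}(?<!-)\.)+[a-zA-Z]{2,63}\.?$)"
)


def is_fqdn_allowing_root_dot(name: str) -> bool:
    """Return True if name matches, with or without one trailing dot."""
    return TRAILING_DOT_RE.fullmatch(name) is not None


def strip_root_dot(name: str) -> str:
    """Drop a single trailing dot so 'a.com.' and 'a.com' compare equal."""
    return name[:-1] if name.endswith(".") else name


for name in ["a.com", "a.com.", "a.com..", "911."]:
    print(name, is_fqdn_allowing_root_dot(name), strip_root_dot(name))
```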

I wasn't familiar with trailing dot syntax until @ChaimKut pointed it out and I did some research:

  • http://dns-sd.org./TrailingDotsInDomainNames.html
  • https://jdebp.eu./FGA/web-fully-qualified-domain-name.html

Using trailing dots, however, seems to cause somewhat unpredictable results in the various tools I played with, so I would advise some caution.

bkr answered Sep 28 '22 01:09