 

What does xn-- in a domain name mean?

Tags:

dns

I want to know what the xn-- prefix and the -66b suffix mean in a domain name. For example, I bought diseñolatinoamericano.com, with an ñ.

In Mozilla it appears as http://xn--diseolatinoamericano-66b.com/, and on Facebook I can't link anything.

Thanks!

Jepser Bernardino asked Mar 15 '12 16:03




3 Answers

It's the result of IDNA encoding, i.e. converting your Unicode domain name to its ASCII equivalent. This has to be done because DNS is not Unicode-aware.

The xn-- prefix says "everything that follows in this label is encoded Unicode".
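To see the conversion in action, here is a minimal sketch using Python's built-in "idna" codec. That codec follows the older IDNA 2003 rules, so a registry or browser applying IDNA 2008 may treat some other names differently; for this domain the result matches the URL in the question.

```python
# Round-trip the asker's domain through Python's built-in "idna" codec.
domain = "diseñolatinoamericano.com"

# Unicode -> ASCII form that is actually sent to DNS.
ascii_form = domain.encode("idna").decode("ascii")
print(ascii_form)  # xn--diseolatinoamericano-66b.com

# Only the label that contains a non-ASCII character gets the xn-- prefix.
for label in ascii_form.split("."):
    print(label, label.startswith("xn--"))

# ASCII -> Unicode restores the original name.
restored = ascii_form.encode("ascii").decode("idna")
print(restored)
```

Both forms name the same domain; the xn-- form is simply what ends up in DNS queries and in software that does not display the Unicode form.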

Alex K. answered Oct 22 '22 16:10


This is Punycode, which is used for Internationalizing Domain Names in Applications (IDNA).

From 1:

Punycode is intended for the encoding of labels in the Internationalized Domain Names in Applications (IDNA) framework, such that these domain names may be represented in the ASCII character set allowed in the Domain Name System of the Internet. The encoding syntax is defined in IETF document RFC 3492.

From 2:

Internationalizing Domain Names in Applications (IDNA) is a mechanism defined in 2003 for handling internationalized domain names containing non-ASCII characters. These names either are Latin letters with diacritics (ñ, é) or are written in languages or scripts which do not use the Latin alphabet: Arabic, Hangul, Hiragana and Kanji for instance. Although the Domain Name System supports non-ASCII characters, applications such as e-mail and web browsers restrict the characters which can be used as domain names for purposes such as a hostname.
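The quotes say Punycode works on individual labels, with IDNA wrapped around it. A rough sketch of that split follows, using Python's "punycode" codec. The helper names to_ascii_label and to_ascii_domain are made up for illustration, and the sketch assumes the input is already normalized, skipping the normalization and validation steps that real IDNA processing performs.

```python
ACE_PREFIX = "xn--"  # marks a label as Punycode-encoded


def to_ascii_label(label: str) -> str:
    """Encode one label (hypothetical helper); ASCII labels pass through."""
    if label.isascii():
        return label
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


def to_ascii_domain(domain: str) -> str:
    """Apply to_ascii_label to each dot-separated label (hypothetical helper)."""
    return ".".join(to_ascii_label(part) for part in domain.lower().split("."))


print(to_ascii_domain("diseñolatinoamericano.com"))
# xn--diseolatinoamericano-66b.com
```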

Jonas Schäfer answered Oct 22 '22 16:10


The (simplified) meaning of 66b (i.e. the string after the last -) in your example is: "Move the cursor in diseolatinoamericano 4 characters to the right and insert an ñ". The next larger code, 76b (the digits are little-endian, so the first one is the least significant), means to move one character further:

$ idn2 -d xn--diseolatinoamericano-76b
diseoñlatinoamericano

If you increase the code further, you get:

-76b -> diseoñlatinoamericano
-86b -> diseolñatinoamericano
-96b -> diseolañtinoamericano
-b7b -> diseolatñinoamericano
-c7b -> diseolatiñnoamericano
-d7b -> diseolatinñoamericano
-e7b -> diseolatinoñamericano
-f7b -> diseolatinoañmericano
...
-m7b -> diseolatinoamericanño
-n7b -> diseolatinoamericanoñ

resulting in the position of the ñ moving further to the right.

After this, increasing the code once more resets the insertion position to the start of the string and increases the code point of the character to insert by one. ñ has code point 241 and the next one is ò (242), so:

-o7b -> òdiseolatinoamericano
-p7b -> dòiseolatinoamericano
...
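If idn2 is not at hand, the same experiment can be sketched with Python's built-in "punycode" codec. Note that the codec works on the bare label, without the xn-- prefix.

```python
# Decode a few of the suffixes listed above with the built-in punycode codec.
base = "diseolatinoamericano"
suffixes = ["66b", "76b", "96b", "b7b", "n7b", "o7b", "p7b"]

decoded = {
    suffix: f"{base}-{suffix}".encode("ascii").decode("punycode")
    for suffix in suffixes
}

for suffix, name in decoded.items():
    print(f"-{suffix} -> {name}")
```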

The exact details (e.g. why -a6b had to be skipped above) can be found in RFC 3492.
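As a rough illustration of those details, the sketch below computes the suffix for the special case of one non-ASCII character inserted into an ASCII string. The function name encode_single_insertion is made up for this example; the constants are the ones RFC 3492 defines. Because only one number is encoded, the bias adaptation that the full algorithm applies between numbers never comes into play.

```python
BASE, TMIN, TMAX, INITIAL_BIAS, INITIAL_N = 36, 1, 26, 72, 128
DIGITS = "abcdefghijklmnopqrstuvwxyz0123456789"  # a..z = 0..25, 0..9 = 26..35


def encode_single_insertion(basic: str, pos: int, char: str) -> str:
    """Suffix for inserting one non-ASCII char at index pos of ASCII basic.

    Hypothetical helper: covers only this special case, not general Punycode.
    """
    # Code point and position are folded into one counter: each step up in
    # code point costs len(basic) + 1 steps, one per possible insertion slot.
    delta = (ord(char) - INITIAL_N) * (len(basic) + 1) + pos

    out = []
    q, k = delta, BASE
    while True:
        # Threshold for this digit position; a digit below t ends the number.
        t = min(max(k - INITIAL_BIAS, TMIN), TMAX)
        if q < t:
            break
        out.append(DIGITS[t + (q - t) % (BASE - t)])
        q = (q - t) // (BASE - t)
        k += BASE
    out.append(DIGITS[q])  # final digit, least significant digit came first
    return "".join(out)


print(encode_single_insertion("diseolatinoamericano", 4, "ñ"))  # 66b
print(encode_single_insertion("diseolatinoamericano", 8, "ñ"))  # b7b
print(encode_single_insertion("diseolatinoamericano", 0, "ò"))  # o7b
```

For the first digit the threshold is 1, so a leading a (digit value 0) would end the number right there. That is why the sequence jumps from -96b to -b7b instead of passing through a code that starts with a.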

Uwe Kleine-König answered Oct 22 '22 18:10