
Can punycode-encoded email addresses clash with "real" addresses?

Tags: email, punycode

The problem is this: I'm using a third-party email delivery service that doesn't accept email addresses with non-ASCII characters in the name part (the local part), like müller@example.com.

Encoding such an address with Punycode:

http://en.wikipedia.org/wiki/Punycode

http://idnaconv.phlymail.de/index.php?decoded=m%C3%BCller%40example.com&idn_version=2008&encode=Encode+%3E%3E&lang=de

yields this address:

xn--mller-kva@example.com

And sending mail to it via the service seems to work.
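For illustration, the same encoding can be reproduced with Python's built-in "punycode" codec. This is only a sketch: the helper name punycode_local_part is made up here, and prepending "xn--" to a local part is borrowed from the IDNA convention for domain labels, not something any mail standard defines.

```python
def punycode_local_part(address: str) -> str:
    """Punycode-encode only the local part of an address (hypothetical helper)."""
    local, _, domain = address.rpartition("@")
    if local.isascii():
        return address  # nothing to encode
    # The codec returns the bare Punycode form; "xn--" is the ACE prefix
    # that IDNA uses for domain labels, reused here for the local part.
    encoded = local.encode("punycode").decode("ascii")
    return f"xn--{encoded}@{domain}"


print(punycode_local_part("müller@example.com"))  # prints xn--mller-kva@example.com
```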

However, I'm not sure whether someone could register "xn--mller-kva@example.com" directly and thus receive emails meant for "müller@example.com".

Is such a clash possible? Are there other solutions to this problem?

UPDATE

Thanks for the answers. Here's a summary of what we learned:

  • Punycoding the local part of the email address works, and you can send to and receive from such an encoded address (of course).
  • However, there is no guarantee at all that providers or mail clients will understand the encoding, or apply it automatically. Clashes are therefore possible, and the whole idea is not a good one :)
  • One should simply do what everyone else does, which is to not allow or accept non-ASCII name parts, as per the specification.
  • And finally, it turns out the third-party service prohibits such shenanigans anyway.
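
A minimal sketch of the "don't accept non-ASCII name parts" rule, assuming Python 3.7+ for str.isascii(). The function name has_ascii_local_part is hypothetical, and this is not a full address validator; it only checks the one property discussed here.

```python
def has_ascii_local_part(address: str) -> bool:
    """Return True if the address has an "@" and an ASCII-only local part."""
    # Split on the last "@", since a quoted local part may itself contain "@".
    local, sep, domain = address.rpartition("@")
    return bool(sep) and bool(local) and bool(domain) and local.isascii()


for candidate in ("mueller@example.com", "müller@example.com"):
    verdict = "accept" if has_ascii_local_part(candidate) else "reject"
    print(candidate, "->", verdict)
```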
Martin T. asked Sep 21 '11 09:09


2 Answers

The only standard way to send non-US-ASCII characters in the local part of an email address is through RFC 6532 (Internationalized Email Headers) and RFC 6531 (the SMTPUTF8 extension to SMTP).

As far as I know there is no standard way to encode non-US-ASCII characters in the local part of an email address. Notably:

  • Punycode is for domain names only, not the local part. You can have a local part which happens to look like the Punycode encoding of some string, but it should be displayed in its Punycode-encoded form. If a mail program decides to Punycode-decode it before displaying it, that is non-standard behavior.

  • The encoded-word mechanism mentioned in one of the answers (the =?utf-8?Q?foobar?= thing) is not applicable to the local part of a mail address, only to the display name of a mailbox (which is something different but related, i.e. the thing your mail program might display instead of the mail address).

In the end this means that müller@example.com and xn--mller-kva@example.com are two completely unrelated email addresses. They would only be equivalent if they were domain names, but they are not, so they can collide.
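
To make the collision concrete, here is a small sketch. The mailbox table and the helper to_ascii_address are invented for illustration; the helper mimics a sender that punycodes non-ASCII local parts before handing the address to an ASCII-only service.

```python
def to_ascii_address(address: str) -> str:
    """Punycode a non-ASCII local part; leave ASCII addresses untouched."""
    local, _, domain = address.rpartition("@")
    if local.isascii():
        return address
    return "xn--" + local.encode("punycode").decode("ascii") + "@" + domain


# Two distinct mailboxes as far as the mail standards are concerned.
mailboxes = {
    "müller@example.com": "Müller's inbox",
    "xn--mller-kva@example.com": "someone else's inbox",
}

for intended in mailboxes:
    sent_to = to_ascii_address(intended)
    print(f"{intended} -> {sent_to} -> {mailboxes[sent_to]}")
```

Both intended recipients end up at the same ASCII address, so mail meant for the first mailbox lands in the second.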

In theory you could hope that by now (2019) all mail servers support SMTPUTF8 and all clients support internationalized mail, but sadly I would not count on it if it's important.
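
Whether a given server supports it can be read from its EHLO reply, which lists the extensions it offers. A rough sketch follows; the sample reply text and both function names are made up for illustration, and real replies vary from server to server.

```python
def offers_smtputf8(ehlo_reply: str) -> bool:
    """Return True if a multi-line EHLO reply advertises SMTPUTF8."""
    for line in ehlo_reply.splitlines():
        # Reply lines look like "250-KEYWORD params" or "250 KEYWORD".
        keyword = line[4:].split(" ", 1)[0]
        if keyword.upper() == "SMTPUTF8":
            return True
    return False


def can_send(address: str, ehlo_reply: str) -> bool:
    """An ASCII local part needs no extension; a non-ASCII one needs SMTPUTF8."""
    local = address.rpartition("@")[0]
    return local.isascii() or offers_smtputf8(ehlo_reply)


# Hypothetical reply from a server that does not offer SMTPUTF8.
sample_reply = "250-mx.example.com at your service\n250-SIZE 35882577\n250 8BITMIME"
print(can_send("müller@example.com", sample_reply))  # prints False
```

Against a live server, Python's smtplib exposes the same information through has_extn("smtputf8") on an SMTP object after ehlo() has been called.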

By the way, the local part of an email address happens to be the only thing in the mail standards where you might want non-US-ASCII characters and there is no way to encode them (as far as I know). All other parts have encoded-word, Punycode, percent-encoding, Base64, quoted-printable, or some other encoding mechanism.

rustonaut answered Oct 13 '22 09:10


I got bored and was researching this tonight, and apparently this is now codified in the Extended SMTP standard, specifically SMTPUTF8 as per RFC 6531. See http://en.wikipedia.org/wiki/Extended_SMTP#SMTPUTF8

My brief experiment using emoji mailbox names returned the following error when sending via Gmail:

local-part of envelope contains utf8 but remote server did not offer SMTPUTF8

This is the same regardless of whether I used the emoji or the Punycode version of the address.

aendra answered Oct 13 '22 09:10