
Validation/format of display-name in the From header

Tags:

smtp

rfc

rfc2822

I need to know the rules for validating/formatting the From (name-addr) field of an email. The RFC explains the format of name-addr, but does not go into detail about the display-name.

Like this:

From: John Q. Public <[email protected]>

I want to know which characters are allowed and what length is permitted. How do I know that John Q. Public has valid characters? Should I allow only printable US-ASCII characters?

I consulted RFC 2822 and could not find the specific format of a display name.

Iago asked Jul 24 '14 17:07



1 Answer

This is all defined in the RFC you linked to in your question (btw, the newer version of this document is RFC 5322):

display-name    =       phrase
phrase          =       1*word / obs-phrase
word            =       atom / quoted-string
atom            =       [CFWS] 1*atext [CFWS]
atext           =       ALPHA / DIGIT / ; Any character except controls,
                        "!" / "#" /     ;  SP, and specials.
                        "$" / "%" /     ;  Used for atoms
                        "&" / "'" /
                        "*" / "+" /
                        "-" / "/" /
                        "=" / "?" /
                        "^" / "_" /
                        "`" / "{" /
                        "|" / "}" /
                        "~"
specials        =       "(" / ")" /     ; Special characters used in
                        "<" / ">" /     ;  other parts of the syntax
                        "[" / "]" /
                        ":" / ";" /
                        "@" / "\" /
                        "," / "." /
                        DQUOTE

You have to jump around in the document a bit to find the definitions of each of these token types, but they are all there.

Once you have the definitions, all you need to do is scan over your name string and see if it consists only of the valid characters.
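As a rough illustration of that scan, here is a minimal Python sketch that checks whether a single word consists only of atext characters. The names ATEXT and is_atom are my own, not from the RFC or any library, and the sketch ignores the optional CFWS (comments and folding whitespace) that the grammar allows around an atom.

```python
import string

# atext from the grammar above: ALPHA / DIGIT plus the listed symbols.
# (ATEXT and is_atom are illustrative names, not part of any library.)
ATEXT = frozenset(string.ascii_letters + string.digits + "!#$%&'*+-/=?^_`{|}~")


def is_atom(word: str) -> bool:
    """Return True if `word` is non-empty and made only of atext characters.

    Assumption: the optional CFWS around an atom is not handled here.
    """
    return bool(word) and all(ch in ATEXT for ch in word)


def invalid_chars(word: str) -> list:
    """List the characters in `word` that are not atext, in order."""
    return [ch for ch in word if ch not in ATEXT]


if __name__ == "__main__":
    for w in "John Q. Public".split(" "):
        print(w, is_atom(w), invalid_chars(w))
```

Splitting on spaces is only good enough for this example; it would not cope with a quoted-string that contains spaces.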

According to the definitions, a display-name is a phrase and a phrase is one or more word tokens (or an obs-phrase, which I'll ignore for now to make this explanation simpler).

A word token can be either an atom or a quoted-string.

In your example, John Q. Public contains a special character, ".", which cannot appear within an atom token. What about a quoted-string token? Well, let's see...

quoted-string   =       [CFWS]
                        DQUOTE *([FWS] qcontent) [FWS] DQUOTE
                        [CFWS]
qcontent        =       qtext / quoted-pair
qtext           =       NO-WS-CTL /     ; Non white space controls
                        %d33 /          ; The rest of the US-ASCII
                        %d35-91 /       ;  characters not including "\"
                        %d93-126        ;  or the quote character

Based on this, we can tell that a "." is allowed within a quoted-string, so... the correct formatting for your display-name can be any of the following:

From: "John Q. Public" <[email protected]>

or

From: John "Q." Public <[email protected]>

or

From: "John Q." Public <[email protected]>

or

From: John "Q. Public" <[email protected]>

Any one of those will work.
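If you want to check this mechanically, the grammar above can be sketched in Python roughly as follows. This is a simplified illustration and not a complete RFC parser: it assumes the input is an unfolded US-ASCII string, treats whitespace as plain spaces and tabs, and ignores comments and obs-phrase. The names is_valid_display_name and quote_display_name are hypothetical helpers of my own.

```python
import re

# Character classes taken from the ABNF quoted above.
ATEXT = r"[A-Za-z0-9!#$%&'*+\-/=?^_`{|}~]"
# qtext: NO-WS-CTL, %d33, %d35-91, %d93-126 (no '"' and no '\').
QTEXT = r"[\x01-\x08\x0b\x0c\x0e-\x1f\x7f\x21\x23-\x5b\x5d-\x7e]"
# quoted-pair: a backslash followed by any "text" character (no NUL, CR, LF).
QUOTED_PAIR = r"\\[\x01-\x09\x0b\x0c\x0e-\x7f]"
QUOTED_STRING = rf'"(?:{QTEXT}|{QUOTED_PAIR}|[ \t])*"'
# One word (atom or quoted-string), with optional leading whitespace.
WORD = re.compile(rf"[ \t]*(?:{ATEXT}+|{QUOTED_STRING})")


def is_valid_display_name(name: str) -> bool:
    """Return True if `name` is one or more words (atoms or quoted-strings)."""
    pos = 0
    words = 0
    while True:
        match = WORD.match(name, pos)
        if match is None:
            break
        pos = match.end()
        words += 1
    # Only trailing spaces or tabs may remain after the last word.
    return words >= 1 and name[pos:].strip(" \t") == ""


def quote_display_name(name: str) -> str:
    """Return `name` unchanged if already valid, else as one quoted-string.

    Assumption: `name` is US-ASCII without CR/LF; other input is not handled.
    """
    if is_valid_display_name(name):
        return name
    escaped = name.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


if __name__ == "__main__":
    print(is_valid_display_name("John Q. Public"))
    print(quote_display_name("John Q. Public"))
```

Quoting the whole name is just the simplest of the four options listed above. Python's standard library has email.utils.formataddr, which applies similar quoting when it builds a complete header value.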

jstedfast answered Oct 23 '22 14:10
