
Regex to validate a Message-ID as per RFC 2822

I have not found a regex that does this. I need to validate the "Message-ID:" value from an email. It is similar to an email address validation regex but much simpler, without most of the edge cases that an email address allows. From RFC 2822:

msg-id          =       [CFWS] "<" id-left "@" id-right ">" [CFWS] 
id-left         =       dot-atom-text / no-fold-quote / obs-id-left
id-right        =       dot-atom-text / no-fold-literal / obs-id-right
no-fold-quote   =       DQUOTE *(qtext / quoted-pair) DQUOTE
no-fold-literal =       "[" *(dtext / quoted-pair) "]"

Let's say the outer <> are optional. dot-atom-text and the other missing definitions can be found in RFC 2822.

I am not proficient in regex and would prefer to use an already tested one, if one exists.

Persimmonium asked Oct 19 '10 12:10



2 Answers

If anyone's interested, one of our senior architects worked through the many layers of RFC 2822 and came up with the following regex, which includes quoting on the left and right sides. The spec says that new implementations should not use the obsolete characters, so this regex does not allow them:

((([a-zA-Z0-9!#$%&'*+/=?^_`{|}~-]+(\.[a-zA-Z0-9!#$%&'*+/=?^_`{|}~-]+)*)|("(([\x01-\x08\x0B\x0C\x0E-\x1F\x7F]|[\x21\x23-\x5B\x5D-\x7E])|(\\[\x01-\x09\x0B\x0C\x0E-\x7F]))*"))@(([a-zA-Z0-9!#$%&'*+/=?^_`{|}~-]+(\.[a-zA-Z0-9!#$%&'*+/=?^_`{|}~-]+)*)|(\[(([\x01-\x08\x0B\x0C\x0E-\x1F\x7F]|[\x21-\x5A\x5E-\x7E])|(\\[\x01-\x09\x0B\x0C\x0E-\x7F]))*\])))
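The same grammar is easier to audit when it is assembled from named pieces. The following is a minimal Java sketch, not the architect's original code: the class name Rfc2822MessageId and the method isValid are hypothetical, CFWS and the obsolete forms are ignored, and the outer <> are treated as optional (but paired), as the question allows.

```java
import java.util.regex.Pattern;

// Hypothetical helper; the names are illustrative and not taken from the answer.
final class Rfc2822MessageId {
    // atext: ALPHA / DIGIT / "!#$%&'*+-/=?^_`{|}~"
    private static final String ATEXT = "[a-zA-Z0-9!#$%&'*+/=?^_`{|}~-]";
    // dot-atom-text = 1*atext *("." 1*atext)
    private static final String DOT_ATOM_TEXT = ATEXT + "+(?:\\." + ATEXT + "+)*";
    // NO-WS-CTL = %d1-8 / %d11 / %d12 / %d14-31 / %d127 (body of a character class)
    private static final String NO_WS_CTL = "\\x01-\\x08\\x0B\\x0C\\x0E-\\x1F\\x7F";
    // qtext = NO-WS-CTL / %d33 / %d35-91 / %d93-126
    private static final String QTEXT = "[" + NO_WS_CTL + "\\x21\\x23-\\x5B\\x5D-\\x7E]";
    // dtext = NO-WS-CTL / %d33-90 / %d94-126
    private static final String DTEXT = "[" + NO_WS_CTL + "\\x21-\\x5A\\x5E-\\x7E]";
    // quoted-pair = "\" text, with text = %d1-9 / %d11 / %d12 / %d14-127
    private static final String QUOTED_PAIR = "\\\\[\\x01-\\x09\\x0B\\x0C\\x0E-\\x7F]";
    // no-fold-quote = DQUOTE *(qtext / quoted-pair) DQUOTE
    private static final String NO_FOLD_QUOTE = "\"(?:" + QTEXT + "|" + QUOTED_PAIR + ")*\"";
    // no-fold-literal = "[" *(dtext / quoted-pair) "]"
    private static final String NO_FOLD_LITERAL = "\\[(?:" + DTEXT + "|" + QUOTED_PAIR + ")*\\]";
    // id-left "@" id-right, without the obs-* alternatives
    private static final String ID = "(?:" + DOT_ATOM_TEXT + "|" + NO_FOLD_QUOTE + ")"
            + "@(?:" + DOT_ATOM_TEXT + "|" + NO_FOLD_LITERAL + ")";
    // Either "<id>" or a bare id; a lone "<" or ">" is rejected.
    private static final Pattern MSG_ID = Pattern.compile("<" + ID + ">|" + ID);

    static boolean isValid(String value) {
        // matches() requires the whole input to match, so no ^ or $ anchors are needed.
        return value != null && MSG_ID.matcher(value).matches();
    }
}
```

Calling Rfc2822MessageId.isValid with a full header value such as "<1234@local.machine.example>" or with the bare id checks the whole string, whereas the regex above has no anchors and would also find a match inside a longer string when used with find().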
Nathan answered Oct 12 '22 16:10



As I could not find one, I ended up implementing it myself. It is not a proper validation as per RFC 2822, but it is a good enough approximation for now:

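import java.util.regex.Matcher;
import java.util.regex.Pattern;
// Assumed: Apache Commons Lang 3 (the 2.x line uses org.apache.commons.lang.StringUtils)
import org.apache.commons.lang3.StringUtils;

// dot-atom-text "@" dot-atom-text; the quoted and literal forms are not handled
static final String VALIDMIDPATTERN = "[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*";
// atext allows upper-case letters too, so match case-insensitively
private static final Pattern patvalidmid = Pattern.compile(VALIDMIDPATTERN, Pattern.CASE_INSENSITIVE);

public static boolean isMessageIdValid(String midt) {
    // a null value would otherwise fail at trim() below
    if (midt == null)
        return false;
    String mid = midt;
    // allow at most one "<" and one ">"
    if (StringUtils.countMatches(mid, "<") > 1)
        return false;
    if (StringUtils.countMatches(mid, ">") > 1)
        return false;
    // extract from <>; an unbalanced or empty pair yields null or blank
    if (StringUtils.containsAny(mid, "<>")) {
        mid = StringUtils.substringBetween(mid, "<", ">");
        if (StringUtils.isBlank(mid)) {
            return false;
        }
    }
    // consecutive dots are never valid in dot-atom-text
    if (StringUtils.contains(mid, "..")) {
        return false;
    }
    mid = mid.trim();
    // now validate
    Matcher m = patvalidmid.matcher(mid);
    return m.matches();
}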
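For projects that do not already depend on Commons Lang, roughly the same approximation can be written with the JDK alone. This is only a sketch under a few assumptions: the class name SimpleMessageIdCheck is hypothetical, matching is case-insensitive, and the check is deliberately stricter than the method above in that it rejects any text outside the <> pair and any whitespace inside it.

```java
import java.util.regex.Pattern;

// Hypothetical JDK-only variant; the names are illustrative.
final class SimpleMessageIdCheck {
    // One run of atext characters (upper case is covered by CASE_INSENSITIVE below).
    private static final String ATOM = "[a-z0-9!#$%&'*+/=?^_`{|}~-]+";
    // dot-atom-text: runs separated by single dots, so ".." can never match.
    private static final String DOT_ATOM = ATOM + "(?:\\." + ATOM + ")*";
    private static final Pattern ID =
            Pattern.compile(DOT_ATOM + "@" + DOT_ATOM, Pattern.CASE_INSENSITIVE);

    static boolean isMessageIdValid(String raw) {
        if (raw == null) {
            return false;
        }
        String mid = raw.trim();
        boolean opens = mid.startsWith("<");
        boolean closes = mid.endsWith(">");
        // The outer <> are optional, but they must come as a pair.
        if (opens != closes) {
            return false;
        }
        if (opens) {
            mid = mid.substring(1, mid.length() - 1);
        }
        // matches() checks the whole remaining string.
        return ID.matcher(mid).matches();
    }
}
```

Like the original, this accepts only the dot-atom-text form on both sides of the "@", so a quoted id-left or a bracketed id-right is reported as invalid.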
Persimmonium answered Oct 12 '22 17:10
