How to tell if a random string is an email address or something else

Tags: java, email

I don't think this question has been asked before... I certainly cannot find anything with this requirement.

Background

There is an API that returns IDs of people. In general the ID should be treated as case sensitive. But if the ID is actually an email address, and you are talking to a less-than-stellar implementation of this API that returns a mixed-case version of that address, there is plenty of fun to be had...

So you are talking to one implementation, and it gives you back URL-like things as the ID, e.g.

  • http://foo.bar.com/blahblahblah

You could next be talking to another implementation... that gives you back some non-obvious ID, e.g.

  • as€jlhdésdj678hjghas7t7qhjdhg£

You could be talking to a nice implementation which gives you back a nice lowercase email address:

Or you could be talking to the less than stellar implementation that returns the exactly equivalent ID

RFC 2821 states that only the local part of the mailbox is case sensitive (the domain is not), but that exploiting that case sensitivity will cause a raft of interoperability issues...
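To make the RFC point concrete, here is a minimal sketch of comparing two IDs that are already known to be plain addresses: the part before the last `@` is compared exactly, the part after it case-insensitively. The names `MailboxCompare` and `sameMailbox` are made up for illustration, and the code assumes a bare `local@domain` form with no comments, quoting, or domain literals.

```java
final class MailboxCompare {
    private MailboxCompare() {}

    /**
     * Compares two IDs, treating the text after the last '@' as a
     * case-insensitive domain and everything before it as case sensitive.
     * IDs without an '@' are compared exactly.
     */
    static boolean sameMailbox(String a, String b) {
        int atA = a.lastIndexOf('@');
        int atB = b.lastIndexOf('@');
        if (atA < 0 || atB < 0) {
            return a.equals(b); // not address-shaped: fall back to an exact match
        }
        String localA = a.substring(0, atA);
        String localB = b.substring(0, atB);
        String domainA = a.substring(atA + 1);
        String domainB = b.substring(atB + 1);
        return localA.equals(localB) && domainA.equalsIgnoreCase(domainB);
    }
}
```

Under these assumptions, two IDs that differ only in the case of the domain count as the same mailbox, while a difference in the case of the local part does not.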

What I want to do is identify the strings that are email addresses and force the domain to lowercase. Identifying the URI-like strings is easier: the scheme is either http or https, and I just need to lowercase the domain name, which is a lot easier to parse.
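For the URI-like case, a rough sketch using `java.net.URI` from the JDK could look like the following. The names `UriHostLowercaser` and `lowerCaseHostIfHttpUri` are hypothetical, and the approach (parse, check the scheme, then lowercase only the host span in the original string) is one possible way to do it, not something taken from the API in question. Anything that does not parse as an http or https URI with a host is returned untouched.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

final class UriHostLowercaser {
    private UriHostLowercaser() {}

    /** Lowercases the host of an http(s) URI; returns any other string unchanged. */
    static String lowerCaseHostIfHttpUri(String id) {
        final URI uri;
        try {
            uri = new URI(id);
        } catch (URISyntaxException e) {
            return id; // not parseable as a URI
        }
        String scheme = uri.getScheme();
        String host = uri.getHost();
        if (scheme == null || host == null) {
            return id; // relative reference or no server-based authority
        }
        if (!scheme.equalsIgnoreCase("http") && !scheme.equalsIgnoreCase("https")) {
            return id;
        }
        // The host follows "scheme://" and, if present, "userinfo@".
        String userInfo = uri.getRawUserInfo();
        int hostStart = scheme.length() + "://".length()
                + (userInfo == null ? 0 : userInfo.length() + 1);
        if (!id.startsWith(host, hostStart)) {
            return id; // unexpected layout: bail out rather than guess
        }
        return id.substring(0, hostStart)
                + host.toLowerCase(Locale.ROOT)
                + id.substring(hostStart + host.length());
    }
}
```

Only the host span is rewritten, so the path, query, and any user info keep their original case.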

Question

If given a string provided by an external service, is there a test I can use that will determine if the string is an email address so I can force the domain name to lower case?

It is acceptable for a small % of email addresses to be missed and not get the domain name lowercased. (False negatives allowed)

It is not acceptable to force part of a string to lowercase if it is not the domain part of an email address. (False positives not allowed)

Update

Note that this question is subtly different from this and this: in the context of those two questions you already know that the string is supposed to be an email address.

In the context of this question we do not know whether the string is an email address or something else, which is what makes this question different.

Stephen Connolly asked Aug 27 '13 11:08


3 Answers

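- Try the code below; it may be helpful to you.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EmailCheck {

    public static void main(String[] args) {
        // Placeholder address for illustration.
        String email = "user@example.com";
        // Local part, '@', domain labels, then a TLD of 2 to 4 letters.
        // Longer TLDs are rejected, which only produces false negatives.
        Pattern pattern = Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,4}");
        Matcher mat = pattern.matcher(email);

        // matches() requires the whole string to match the pattern.
        if (mat.matches()) {
            System.out.println("Valid email address");
        } else {
            System.out.println("Not a valid email address");
        }
    }
}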

- Also take a look at this site, which shows deeper validation using a regular expression.
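The pattern in this answer only reports whether the string matches. As a hedged sketch of how it might be applied to the question's actual goal, the same pattern can be given two capture groups so that only the domain is lowercased. The names `RegexDomainLowercaser` and `lowerCaseDomainIfEmail` are invented for this example; strings that do not match as a whole are returned unchanged.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegexDomainLowercaser {
    private RegexDomainLowercaser() {}

    // Same pattern as in the answer, with group 1 = local part and group 2 = domain.
    private static final Pattern EMAIL = Pattern.compile(
            "([A-Za-z0-9._%+-]+)@([A-Za-z0-9.-]+\\.[A-Za-z]{2,4})");

    /** Lowercases the domain if the whole string matches the pattern; otherwise returns it as-is. */
    static String lowerCaseDomainIfEmail(String id) {
        Matcher m = EMAIL.matcher(id);
        if (!m.matches()) {
            return id; // not recognized as an email: leave untouched
        }
        return m.group(1) + "@" + m.group(2).toLowerCase(Locale.ROOT);
    }
}
```

Because the character classes exclude `:` and `/`, URL-like IDs never match, and addresses with a TLD longer than four letters are simply skipped, which is a false negative rather than a false positive.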

Kumar Vivek Mitra answered Sep 30 '22 07:09


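You can use the following to verify an email address:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Placeholder address for illustration.
String email = "user@example.com";
// Very permissive: anything, '@', anything, '.', then lowercase letters.
// It also matches URL-like strings that contain an '@',
// e.g. "http://user@foo.bar.com".
Pattern p = Pattern.compile(".+@.+\\.[a-z]+");
Matcher m = p.matcher(email);
boolean matchFound = m.matches();
if (matchFound) {
    // your work here
}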
Shiv answered Sep 30 '22 08:09


Thanks to @Dukeling

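import java.util.Locale;
import javax.mail.internet.AddressException;
import javax.mail.internet.InternetAddress;

private static String toLowerCaseIfEmail(String string) {
    try {
        // Strict parsing: throws if the string is not a valid address.
        new InternetAddress(string, true);
    } catch (AddressException e) {
        return string;
    }
    // A trailing ']' indicates a domain literal: leave it alone.
    if (string.trim().endsWith("]")) {
        return string;
    }
    int lastAt = string.lastIndexOf('@');
    if (lastAt == -1) {
        return string;
    }
    // Lowercase only the domain; Locale.ROOT avoids locale-specific case mappings.
    return string.substring(0, lastAt) + string.substring(lastAt).toLowerCase(Locale.ROOT);
}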

should, from what I can tell, do the required thing.

Update

The previous version ignored the possibility of (comment) syntax after the last @. Let's face it: if we see a comment or a domain literal there, we should just bail out fast and return the string unmodified.

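import java.util.Locale;
import javax.mail.internet.AddressException;
import javax.mail.internet.InternetAddress;

private static String toLowerCaseIfEmail(String string) {
    try {
        // Strict parsing: throws if the string is not a valid address.
        new InternetAddress(string, true);
    } catch (AddressException e) {
        return string;
    }
    int lastAt = string.lastIndexOf('@');
    // Bail out if there is no '@', or if a domain literal (']')
    // or a comment (')') ends after the last '@'.
    if (lastAt == -1
        || string.lastIndexOf(']') > lastAt
        || string.lastIndexOf(')') > lastAt) {
        return string;
    }
    // Lowercase only the domain; Locale.ROOT avoids locale-specific case mappings.
    return string.substring(0, lastAt) + string.substring(lastAt).toLowerCase(Locale.ROOT);
}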
Stephen Connolly answered Sep 30 '22 06:09