Detecting mistyped email addresses in JavaScript

I've noticed that users sometimes mistype their email address in a contact-us form, for example typing @yahho.com, @yhoo.com, or @yahoo.co instead of @yahoo.com.

I feel this can be corrected on the spot with some JavaScript: check the email address for likely mistakes, such as the ones listed above, so that if the user types an address ending in, say, @yahho.com, an unobtrusive message can be displayed suggesting that they probably mean @yahoo.com and asking them to double-check what they typed.

The question is:
How can I detect, in JavaScript, that a string is very similar to "yahoo" or "yahoo.com"? Or, more generally, how can I measure the level of similarity between two strings?

P.S. (a side note) In my specific case, the users are not native English speakers, and most of them are nowhere near fluent; the site itself is not in English.

hasen asked Jan 20 '09 03:01


2 Answers

Here's a quick-and-dirty implementation that gives you some simple checks using the Levenshtein distance. Credit for the "levenshteinenator" goes to this link. Add whatever popular domains you want to the domains array; the code then checks whether the distance between each of them and the host part of the entered email is 1 or 2, which is close enough to assume there's a typo somewhere (a distance of 0 is an exact match).

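// Levenshtein distance: the minimum number of single-character insertions,
// deletions, and substitutions needed to turn a into b
var levenshteinenator = function(a, b) {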
    var cost;

    // get values
    var m = a.length;
    var n = b.length;

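    // make sure a.length >= b.length; the distance is symmetric, so swapping is safe
    // (note: this version still builds the full matrix, so it does not save space)
    if (m < n) {
        var tmpStr = a; a = b; b = tmpStr;
        var tmpLen = m; m = n; n = tmpLen;
    }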

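    // r[i][j] will hold the distance between the first i characters of a
    // and the first j characters of b; row 0 is the distance from the empty string
    var r = [];
    r[0] = [];
    for (var col = 0; col <= n; col++) {
        r[0][col] = col;
    }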

    for (var i = 1; i < m+1; i++) {
        r[i] = new Array();
        r[i][0] = i;
        for (var j = 1; j < n+1; j++) {
            cost = (a.charAt(i-1) == b.charAt(j-1))? 0: 1;
            r[i][j] = minimator(r[i-1][j]+1,r[i][j-1]+1,r[i-1][j-1]+cost);
        }
    }

    return r[m][n];
}

// return the smallest of the three values passed in
// (the strict < comparisons used before returned z when x and y tied for smallest)
var minimator = function(x, y, z) {
    return Math.min(x, y, z);
};

var domains = ['yahoo.com', 'google.com', 'hotmail.com'];
var email = 'user@yahho.com'; // hypothetical address with a mistyped domain
var parts = email.split('@');
var dist;
for (var x = 0; x < domains.length; x++) {
    dist = levenshteinenator(domains[x], parts[1]);
    // a distance of 1 or 2 suggests a typo; 0 would be an exact match
    if (dist == 1 || dist == 2) {
        alert('did you mean ' + domains[x] + '?');
    }
}
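
If you'd rather get a single suggestion back than an alert per matching domain, something along these lines might work. It is only a sketch: the names `levenshtein`, `similarity`, `suggestDomain`, and the `maxDistance` parameter are made up for illustration, and the distance function keeps just two rows of the matrix instead of the whole thing. The `similarity` helper is one possible way to turn the distance into a "level of similarity" between 0 and 1.

```javascript
// Levenshtein distance keeping only the previous and current matrix rows.
function levenshtein(a, b) {
  if (a.length < b.length) { var t = a; a = b; b = t; } // make b the shorter string
  var prev = [];
  for (var j = 0; j <= b.length; j++) prev[j] = j;
  for (var i = 1; i <= a.length; i++) {
    var curr = [i];
    for (j = 1; j <= b.length; j++) {
      var cost = a.charAt(i - 1) === b.charAt(j - 1) ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Similarity between 0 and 1: 1 means identical, lower means more edits
// relative to the length of the longer string.
function similarity(a, b) {
  var longest = Math.max(a.length, b.length);
  return longest === 0 ? 1 : 1 - levenshtein(a, b) / longest;
}

// Returns the closest known domain within maxDistance edits, or null when
// the host already matches a known domain or nothing is close enough.
function suggestDomain(email, knownDomains, maxDistance) {
  var at = email.lastIndexOf('@');
  if (at === -1) return null;
  var host = email.slice(at + 1).toLowerCase();
  var best = null;
  var bestDist = maxDistance + 1;
  for (var i = 0; i < knownDomains.length; i++) {
    var d = levenshtein(host, knownDomains[i]);
    if (d === 0) return null; // exact match, nothing to suggest
    if (d < bestDist) { best = knownDomains[i]; bestDist = d; }
  }
  return best;
}

// Hypothetical usage: show the hint next to the field instead of using alert().
var hint = suggestDomain('user@yahho.com', ['yahoo.com', 'google.com', 'hotmail.com'], 2);
if (hint !== null) {
  console.log('Did you mean @' + hint + '?');
}
```

Returning the suggestion rather than alerting leaves it up to the form to decide how to display the message, which fits the unobtrusive hint the question asks for.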

Paolo Bergantino answered Sep 19 '22 10:09


In addition to soundex, you may also want to have a look at algorithms for determining Levenshtein distance.
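
For comparison, a Soundex check could be sketched roughly as below. This is a minimal take on American Soundex, not a vetted library; the function names `soundex` and `soundsLike`, and the choice to compare only the part of the host before the first dot, are assumptions made for illustration. Soundex was designed for English surnames, so it is much coarser than edit distance and will probably flag more false matches.

```javascript
// Minimal American Soundex: first letter plus up to three digits, zero-padded.
function soundex(word) {
  var codes = {
    b: 1, f: 1, p: 1, v: 1,
    c: 2, g: 2, j: 2, k: 2, q: 2, s: 2, x: 2, z: 2,
    d: 3, t: 3,
    l: 4,
    m: 5, n: 5,
    r: 6
  };
  var letters = word.toLowerCase().replace(/[^a-z]/g, '');
  if (letters.length === 0) return '';
  var result = letters.charAt(0).toUpperCase();
  var prev = codes[letters.charAt(0)] || 0; // code of the first letter, 0 if none
  for (var i = 1; i < letters.length && result.length < 4; i++) {
    var ch = letters.charAt(i);
    // h and w are skipped without separating equal codes on either side
    if (ch === 'h' || ch === 'w') continue;
    var code = codes[ch] || 0; // vowels and y have no code
    if (code !== 0 && code !== prev) result += code;
    prev = code; // a vowel resets prev, so a repeated code after it counts again
  }
  while (result.length < 4) result += '0';
  return result;
}

// Hypothetical helper: compares the first label of two host names by sound.
function soundsLike(hostA, hostB) {
  return soundex(hostA.split('.')[0]) === soundex(hostB.split('.')[0]);
}

console.log(soundex('Robert'));                      // R163
console.log(soundsLike('hotmial.com', 'hotmail.com')); // true
```

Note that short, vowel-heavy names collapse to very few distinct codes ("yahoo" has no coded consonants after the first letter), so on its own this is likely too loose; it may work better as a secondary check next to the Levenshtein distance.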

Abie answered Sep 18 '22 10:09