
JavaScript: find if a string contains English letters only

I'm trying to match some text only if it contains English letters and numbers, using JavaScript/jQuery.

I'm wondering what the most efficient way to do this is. Since there could be thousands of words, it should be as fast as possible, and I don't want to use regex.

 var names = [];
 names[0] = 'test';
 names[1] = 'हिन';
 names[2] = 'لعربية';

 for (var i = 0; i < names.length; i++) {
    if (names[i] == ENGLISHMATCHCODEHERE) {
        // do something here
    }
 }

Thank you for your time.

Alec Smart, asked Mar 08 '10 16:03


2 Answers

A regular expression for this might be:

var english = /^[A-Za-z0-9]*$/;

Now, I don't know whether you'll want to include spaces and stuff like that; the regular expression could be expanded. You'd use it like this:

if (english.test(names[i])) // ...
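
As a rough sketch of how the expression could be expanded, here is a variant that also accepts spaces. The name `englishWithSpaces` and the extra sample string are made up for illustration; they are not part of the question.

```javascript
// Hypothetical sample data, mirroring the array in the question.
var names = ['test', 'हिन', 'لعربية', 'hello world 42'];

// Letters and digits only, as in the expression above.
var english = /^[A-Za-z0-9]*$/;
// Assumed variant that also accepts the space character.
var englishWithSpaces = /^[A-Za-z0-9 ]*$/;

var strict = names.filter(function (name) {
    return english.test(name);
});
var relaxed = names.filter(function (name) {
    return englishWithSpaces.test(name);
});

console.log(strict);  // only 'test' passes the strict check
console.log(relaxed); // 'test' and 'hello world 42' pass the relaxed check
```

Note that `*` also matches the empty string; switching to `+` would reject it, if that matters for your data.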

Also see this: Regular expression to match non-English characters?

Edit: my brain filtered out the "I don't want to use a regex" part because it failed the "isSilly()" test. You could always check the character code of each letter in the word, but that's going to be slower (maybe much slower) than letting the regex matcher do the work. The built-in regular expression engine is really fast.
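
For reference, a character-code check might look something like this. It is only a sketch; the function name `isEnglishByCharCode` is an assumption, and it accepts exactly the same characters as the regex above (0-9, A-Z, a-z).

```javascript
// Sketch: returns true when every character is an ASCII digit or letter.
function isEnglishByCharCode(str) {
    for (var i = 0; i < str.length; i++) {
        var code = str.charCodeAt(i);
        var isDigit = code >= 48 && code <= 57;  // '0'..'9'
        var isUpper = code >= 65 && code <= 90;  // 'A'..'Z'
        var isLower = code >= 97 && code <= 122; // 'a'..'z'
        if (!isDigit && !isUpper && !isLower) {
            return false; // stop at the first character outside the ranges
        }
    }
    return true; // also true for the empty string, like the regex with *
}

console.log(isEnglishByCharCode('test')); // true
console.log(isEnglishByCharCode('हिन'));  // false
```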

When you're worried about performance, always do some simple tests first before making assumptions about the technology (unless you've got intimate knowledge of the technology already).
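
A simple test of that kind could be sketched as below. The word list, the helper names (`byRegex`, `byCharCode`, `time`) and the workload size are all assumptions for illustration, and `Date.now()` is only a coarse timer, so treat the numbers as a rough hint rather than a verdict.

```javascript
// Hypothetical workload: repeat a few sample words many times.
var samples = ['test', 'हिन', 'لعربية', 'abc123'];
var words = [];
for (var n = 0; n < 100000; n++) {
    words.push(samples[n % samples.length]);
}

var english = /^[A-Za-z0-9]*$/;
function byRegex(str) {
    return english.test(str);
}

function byCharCode(str) {
    for (var i = 0; i < str.length; i++) {
        var c = str.charCodeAt(i);
        var ok = (c >= 48 && c <= 57) || (c >= 65 && c <= 90) || (c >= 97 && c <= 122);
        if (!ok) {
            return false;
        }
    }
    return true;
}

// Runs `check` over every word; returns the match count and elapsed milliseconds.
function time(check) {
    var start = Date.now();
    var matches = 0;
    for (var i = 0; i < words.length; i++) {
        if (check(words[i])) {
            matches++;
        }
    }
    return { matches: matches, ms: Date.now() - start };
}

var regexResult = time(byRegex);
var charCodeResult = time(byCharCode);
console.log('regex: ' + regexResult.ms + ' ms, charCode: ' + charCodeResult.ms + ' ms');
```

Comparing the two `matches` counts is a cheap sanity check that both approaches agree before you read anything into the timings, which will vary by engine and input.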

Pointy, answered Oct 06 '22 20:10


If you're dead set against using regexes, you could do something like this:

// Whatever valid characters you want here
var ENGLISH = {};
"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789".split("").forEach(function(ch) {
    ENGLISH[ch] = true;
});

function stringIsEnglish(str) {
    var index;

    for (index = str.length - 1; index >= 0; --index) {
        if (!ENGLISH[str.substring(index, index + 1)]) {
            return false;
        }
    }
    return true;
}

Example usage:

console.log("valid", stringIsEnglish("valid"));
console.log("invalid", stringIsEnglish("invalid!"));

...but a regex (/^[a-z0-9]*$/i.test(str)) would almost certainly be faster. It is in this synthetic benchmark, but those are often unreliable.

T.J. Crowder, answered Oct 06 '22 20:10