
PHP: fastest way to check for invalid characters (all but a-z, A-Z, 0-9, #, -, ., $)?

I have to check the buffer input to a PHP socket server as fast as possible. To do so, I need to know whether the input message $buffer contains any characters other than the following: a-z, A-Z, 0-9, #, -, . and $

I'm currently using the following ereg function, but wonder if there are ways to optimize the speed. Should I maybe use a different function, or a different regex?

if (ereg("[A-Za-z0-9]\.\#\-\$", $buffer) === false)
{
    echo "buffer only contains valid characters: a-z, A-Z, 0-9, #, -, ., $";
}
Tom asked Nov 14 '09 23:11


4 Answers

Try this function:

function isValid($str) {
    return !preg_match('/[^A-Za-z0-9.#\\-$]/', $str);
}

[^A-Za-z0-9.#\-$] is a negated character class that matches any single invalid character. preg_match returns 1 if it finds a match (an invalid character) and 0 otherwise. Since !1 is false and !0 is true, isValid returns false if an invalid character is found and true otherwise.
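To make the return values concrete, here is a small usage sketch. The function is repeated so the snippet runs on its own, and the sample inputs are made up for illustration rather than taken from the question.

```php
<?php
// Same check as above, repeated so this snippet is self-contained.
function isValid($str) {
    return !preg_match('/[^A-Za-z0-9.#\\-$]/', $str);
}

// Hypothetical sample inputs (not from the original question).
var_dump(isValid('abc-123.#$'));  // bool(true): only allowed characters
var_dump(isValid('abc 123'));     // bool(false): the space is not allowed
var_dump(isValid(''));            // bool(true): nothing to reject
```

Note that an empty string counts as valid here, since there is no invalid character to find. If empty input should be rejected, that needs a separate check.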

Gumbo answered Nov 02 '22 14:11


Only allowing characters a-z uppercase or lowercase..

if (preg_match("/[^A-Za-z]/", $FirstName))
{
    echo "Invalid Characters!";
}

Adding numbers..

if (preg_match("/[^A-Za-z0-9]/", $FirstName))
{
    echo "Invalid Characters!";
}

Adding additional characters to allow (in this case the exclamation mark)..

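(Characters with a special meaning inside a character class, such as \, ], ^ and -, should be preceded with a "\". For others, like the ! shown here, the backslash is optional but harmless.)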

if (preg_match("/[^A-Za-z0-9\!]/", $FirstName))
{
    echo "Invalid Characters!";
}
Richard answered Nov 02 '22 13:11


The preg family of functions is quite a bit faster than ereg. To test for invalid characters, try something like:

if (preg_match('/[^a-z0-9.#$-]/i', $buffer)) print "Invalid characters found";
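
The pattern above relies on two details: the /i modifier makes a-z cover uppercase letters too, and a hyphen placed last in a character class is taken literally. A minimal sketch of how that behaves, with a hypothetical helper name (hasInvalidChars) and made-up inputs:

```php
<?php
// Hypothetical helper; wraps the pattern from the answer above.
function hasInvalidChars($buffer) {
    // /i makes a-z match A-Z as well; the trailing "-" is a literal hyphen.
    return preg_match('/[^a-z0-9.#$-]/i', $buffer) === 1;
}

var_dump(hasInvalidChars('ABC-xyz.#$'));  // bool(false): all characters allowed
var_dump(hasInvalidChars('a_b'));         // bool(true): underscore is not allowed
```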
Richard Simões answered Nov 02 '22 12:11


You'll want to shift over to using preg instead of ereg. The ereg family of functions has been deprecated: since PHP 5.3, using them raises a deprecation notice, and they were removed from the language in PHP 7.0. Also, it's been anecdotal wisdom that the preg functions are, in general, faster than ereg.

As for speed, based on my experience and the codebases I've seen in my career, optimizing this kind of string performance would be premature at this point. Wrap the comparison in some logical function or method:

// sketch based on the OP's check, using preg_match instead of ereg
function isValidForMyNeeds($buffer)
{
    // true when $buffer contains only a-z, A-Z, 0-9, #, -, . and $
    return preg_match('/[^A-Za-z0-9.#$-]/', $buffer) === 0;
}

and then when/if you determine this is a performance problem you can apply any needed optimization in one place.
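
As an illustration of what "in one place" can look like: if profiling did point at this check, the body of the wrapper could be swapped for a non-regex test without touching any callers. The sketch below uses strspn, which is an assumption brought in for illustration and not something proposed elsewhere in this thread. Whether it is actually faster for a given workload would need to be measured.

```php
<?php
// Allowed characters: a-z, A-Z, 0-9, #, -, . and $
const ALLOWED_CHARS = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789#-.$';

function isValidForMyNeeds($buffer)
{
    // strspn() returns the length of the leading run of $buffer that
    // consists only of characters from the mask; if that run covers the
    // whole string, no invalid character is present.
    return strspn($buffer, ALLOWED_CHARS) === strlen($buffer);
}

var_dump(isValidForMyNeeds('abc-123.#$'));  // bool(true)
var_dump(isValidForMyNeeds('abc 123'));     // bool(false)
```

Callers keep calling isValidForMyNeeds($buffer) either way, so the regex and non-regex versions can be benchmarked against each other and exchanged freely.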

Alan Storm answered Nov 02 '22 13:11