Supposedly string is:
$a = "abc-def"
if (preg_match("/[^a-z0-9]/i", $a, $m)){
$i = "i stopped scanning '$a' because I found a violation in it while
scanning it from left to right. The violation was: $m[0]";
}
echo $i;
example above: should indicate "-" was the violation.
I would like to know if there is a non-preg_match way of doing this.
I will likely run benchmarks if there is a non-preg_match way of doing this perhaps 1000 or 1 million runs, to see which is faster and more efficient.
In the benchmarks "$a" will be much longer. To ensure it is not trying to scan the entire "$a" and to ensure it stops soon as it detects a violation within the "$a"
Based on information I have witnessed on the internet, preg_match stops when the first match is found.
UPDATE:
this is based on the answer that was given by "bishop" and will likely to be chosen as the valid answer soon ( shortly ).
i modified it a little bit because i only want it to report the violator character. but i also commented that line out so benchmark can run without entanglements.
let's run a 1 million run based on that answer.
$start_time = microtime(TRUE);
$count = 0;
while ($count < 1000000){
$allowed = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789';
$input = 'abc-def';
$validLen = strspn($input, $allowed);
if ($validLen < strlen($input)){
#echo "violation at: ". substr($input, $validLen,1);
}
$count = $count + 1;
};
$end_time = microtime(TRUE);
$dif = $end_time - $start_time;
echo $dif;
the result is: 0.606614112854
( 60 percent of a second )
let's do it with the preg_match method.
i hope everything is the same. ( and fair ).. ( i say this because there is the ^ character in the preg_match )
$start_time = microtime(TRUE);
$count = 0;
while ($count < 1000000){
$input = 'abc-def';
preg_match("/[^a-z0-9]/i", $input, $m);
#echo "violation at:". $m[0];
$count = $count + 1;
};
$end_time = microtime(TRUE);
$dif = $end_time - $start_time;
echo $dif;
i use "dif" in reference to the terminology "difference".
the "dif" was.. 1.1145210266113
( took 11 percent more than a whole second )
( if it was 1.2 that would mean it is 2x slower than the php way )
You want to find the location of the first character not in the given range, without using regular expressions? You might want strspn
or its complement strcspn
:
$allowed = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789';
$input = 'abc-def';
$validLen = strspn($input, $allowed);
if (strlen($input) !== $validLen) {
printf('Input invalid, starting at %s', substr($input, $validLen));
} else {
echo 'Input is valid';
}
Outputs Input invalid, starting at -def
. See it live.
strspn
(and its complement) are very old, very well specified (POSIX even). The standard implementations are optimized for this task. PHP just leverages that platform implementation, so PHP should be fast, too.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With