
preg_match() vs strpos() for match finding?

Tags:

php



I would prefer strpos() over preg_match(), because regular expressions are generally more expensive to execute.

According to the official PHP documentation for preg_match():

Do not use preg_match() if you only want to check if one string is contained in another string. Use strpos() or strstr() instead as they will be faster.


When in doubt, benchmark!

This is not a rigorous benchmark, but it proves the point: as the number of calls scales up, strpos() is quite a bit faster (almost 2x as fast here).

EDIT: I later noticed that the regex was case-insensitive. Running this again with stripos() for a fairer comparison gives 11 seconds versus 15 seconds, so the gap narrows but preg_match() remains noticeably slower.

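$str = "the quick brown fox";

// Time 10 million strpos() calls. time() has one-second resolution,
// so the numbers are only a rough comparison.
$start1 = time();
for ($i = 0; $i < 10000000; $i++)
{
    if (strpos($str, 'fox') !== false)
    {
        // match found
    }
}
$end1 = time();
echo ($end1 - $start1) . "\n";

// Time 10 million case-insensitive preg_match() calls.
$start2 = time();
for ($i = 0; $i < 10000000; $i++)
{
    if (preg_match('/fox/i', $str))
    {
        // match found
    }
}
$end2 = time();
echo ($end2 - $start2) . "\n";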

// Results:
strpos() = 8sec
preg_match() = 15sec

// Results both case-insensitive (stripos()):
stripos() = 11sec
preg_match() = 15sec

Never use regular expressions unless absolutely necessary. The overhead involved in starting up and deploying the regex engine on a string like this is similar to using a jackhammer instead of a regular hammer, a drill instead of a screwdriver.

Regex also leaves more room for error: mismatched strings, unexpected results, and so on. Stick with strpos() unless it isn't flexible enough.
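To make the "room for error" point concrete, here is a minimal sketch of one common pitfall: a search string containing regex metacharacters. The needle and subject strings are made up for illustration.

```php
<?php
// Hypothetical inputs, chosen only to illustrate the pitfall.
$needle = 'a.b';

// Unescaped, the dot is a wildcard, so the pattern also matches "axb".
var_dump(preg_match('/' . $needle . '/', 'axb'));                   // int(1)

// strpos() treats the needle literally and finds nothing.
var_dump(strpos('axb', $needle));                                    // bool(false)

// preg_quote() escapes metacharacters (and the "/" delimiter given as the
// second argument), which restores a literal match.
var_dump(preg_match('/' . preg_quote($needle, '/') . '/', 'axb'));  // int(0)
var_dump(preg_match('/' . preg_quote($needle, '/') . '/', 'a.b'));  // int(1)
```

With strpos() there is nothing to escape, which is one less thing to get wrong.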


If you're already using preg_match() and preg_replace() all over the place in your code, then go ahead and use it once more. Why?

  1. Performance. Most of the overhead those functions add is in the initial load time of the engine. If you have already paid that price, make it worth it.

  2. Readability. strpos(...) !== false, while faster, is an incredible eyesore.

    It is one of the ugliest PHP constructs.
    The use of !== and false in it is really kludgy: hard to parse and fragile to edit.

Shame on the core team for not having defined an alias like strcontains() for it, years ago.
Now it's well too late to do that, but it would have been nice, back then.
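For what it's worth, PHP 8.0 did eventually add a native str_contains(). On older versions a small wrapper can hide the strict comparison. The sketch below assumes a helper named contains(), which is a hypothetical name and not a PHP built-in; it also shows why the strict !== matters.

```php
<?php
// Hypothetical helper (not a PHP built-in): wraps the strpos() !== false idiom.
function contains(string $haystack, string $needle): bool
{
    // Prefer the native function where it exists (PHP 8.0+).
    if (function_exists('str_contains')) {
        return str_contains($haystack, $needle);
    }
    // Treat an empty needle as always contained, matching str_contains().
    return $needle === '' || strpos($haystack, $needle) !== false;
}

// A match at offset 0 is why the strict comparison is needed:
var_dump(strpos('fox trot', 'fox'));           // int(0)
var_dump(strpos('fox trot', 'fox') == false);  // bool(true), loose comparison misreads the match
var_dump(contains('fox trot', 'fox'));         // bool(true)
var_dump(contains('the quick brown fox', 'cat')); // bool(false)
```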


Good Code More Important

If you think this kind of thing is important, keep in mind that the difference is only a constant factor in Big O terms. Put another way, database calls and operations that are O(n²) or worse are the only things that really matter. In most cases, it's futile to spend time worrying about these low-level commands.
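As a rough illustration of the difference between a constant factor and a change in complexity, here is a sketch of two ways to detect duplicates. Both function names are hypothetical, and the input is assumed to be a list of strings.

```php
<?php
// Quadratic: compares every pair of elements, O(n^2).
function has_duplicates_quadratic(array $items): bool
{
    $items = array_values($items); // ensure 0..n-1 indexing
    $n = count($items);
    for ($i = 0; $i < $n; $i++) {
        for ($j = $i + 1; $j < $n; $j++) {
            if ($items[$i] === $items[$j]) {
                return true;
            }
        }
    }
    return false;
}

// Linear: one pass with a hash lookup per element, O(n).
// Assumes string items, since they are used as array keys.
function has_duplicates_linear(array $items): bool
{
    $seen = [];
    foreach ($items as $item) {
        if (isset($seen[$item])) {
            return true;
        }
        $seen[$item] = true;
    }
    return false;
}

var_dump(has_duplicates_quadratic(['a', 'b', 'a'])); // bool(true)
var_dump(has_duplicates_linear(['a', 'b', 'c']));    // bool(false)
```

Swapping strpos() for preg_match() inside either function changes the constant. Swapping the first function for the second changes how the cost grows with input size.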

That is not to say constants should be ignored. For example, I refactored code that fetched images one at a time, each taking about 1 second; switching to a multi-curl request cut the total from 12 seconds to 1 second. The point is that built-in commands are low-level, and the structure of the code matters more.
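The original refactored code is not shown, so the following is only a sketch of what a multi-curl fetch might look like. It assumes the curl extension is installed; fetch_all() is a hypothetical name, and the URLs in the usage comment are placeholders.

```php
<?php
// Hypothetical helper: fetch several URLs concurrently with curl_multi.
// Returns the response bodies keyed like the input array.
function fetch_all(array $urls): array
{
    $mh = curl_multi_init();
    $handles = [];

    foreach ($urls as $key => $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return body instead of printing it
        curl_multi_add_handle($mh, $ch);
        $handles[$key] = $ch;
    }

    // Drive all transfers until none are still running.
    do {
        $status = curl_multi_exec($mh, $running);
        if ($running > 0 && curl_multi_select($mh) === -1) {
            usleep(1000); // select failed; back off briefly instead of spinning
        }
    } while ($running > 0 && $status === CURLM_OK);

    $results = [];
    foreach ($handles as $key => $ch) {
        $results[$key] = curl_multi_getcontent($ch);
        curl_multi_remove_handle($mh, $ch);
    }
    curl_multi_close($mh);

    return $results;
}

// Usage (placeholder URLs):
// $images = fetch_all(['https://example.com/1.jpg', 'https://example.com/2.jpg']);
```

Because the transfers overlap, the total time is roughly that of the slowest request rather than the sum of all of them.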

The code below makes 10 million lower level calls, and as you can see the "savings" are negligible.

// Record a named checkpoint together with the current timestamp.
function prof_flag($str)
{
    global $prof_timing, $prof_names;
    $prof_timing[] = microtime(true);
    $prof_names[] = $str;
}

// Print each checkpoint name followed by the seconds elapsed until the next one.
function prof_print()
{
    global $prof_timing, $prof_names;
    $size = count($prof_timing);
    for ($i = 0; $i < $size - 1; $i++)
    {
        echo "<b>{$prof_names[$i]}</b><br>";
        echo sprintf("&nbsp;&nbsp;&nbsp;%f<br>", $prof_timing[$i+1] - $prof_timing[$i]);
    }
    echo "<b>{$prof_names[$size-1]}</b><br>";
}


$l = 10000000;
$str = "the quick brown fox";
echo "<h3>Ran " .number_format($l,2) ." calls per command </h3>";

prof_flag("Start: stripos");

for ($i = 0; $i<$l; $i++)
    if (stripos($str, 'fox') !== false) {}


prof_flag("Start: preg_match");

for ($i = 0; $i<$l; $i++)
    if (preg_match('#fox#i', $str) === 1) {}

prof_flag("Finished");
prof_print();

The only value of this code is that it shows a cool way to record how long things take to run, lol.

Ran 10,000,000.00 calls per command

Start: stripos
   2.217225
Start: preg_match
   3.788667
Start: ==
   0.511315
Start: ucwords lol
   2.112984
Finished
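If the checkpoint style above feels heavy, a closure-based timer is another option. This is a minimal sketch, assuming PHP 7.3+ for hrtime(); bench() is a hypothetical helper name, and the iteration count is reduced from the 10 million used above so it finishes quickly.

```php
<?php
// Hypothetical helper: run $fn $iterations times and return elapsed seconds.
function bench(callable $fn, int $iterations): float
{
    $start = hrtime(true); // monotonic clock in nanoseconds (PHP 7.3+)
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    return (hrtime(true) - $start) / 1e9;
}

$str = "the quick brown fox";
$n = 100000; // far fewer calls than the runs above

printf("stripos:    %f\n", bench(function () use ($str) {
    return stripos($str, 'fox') !== false;
}, $n));

printf("preg_match: %f\n", bench(function () use ($str) {
    return preg_match('#fox#i', $str) === 1;
}, $n));
```

Note that the closure call adds its own overhead to both measurements, so the absolute numbers are not comparable with the loops above; only the relative difference is meaningful.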