Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Determine position of one image in another with PHP

I have two images(small and big). One of them contains another one. Something like one image is a photo and another one is a picture of the page of the photoalbum where this photo is situated. I hope you understood what I said.

So how do I get coordinates (x,y) of a small image on the big one using PHP?

like image 895
Pigalev Pavel Avatar asked Dec 11 '22 18:12

Pigalev Pavel


2 Answers

It is quite easy to do on your own, without relying on external libs other than gd.

What you need to be aware of, is that you most likely cannot do a simple pixel per pixel check, as filtering and compression might slightly modify the value of each pixel.

The code I am proposing here will most likely be slow, if performance is a concern, you could optimize it or take shortcuts. Hopefully, the code puts you on the right track!

First, lets iterate on our pictures

$small = imagecreatefrompng("small.png");
$large = imagecreatefrompng("large.png");

$smallwidth = imagesx($small);
$smallheight = imagesy($small);

$largewidth = imagesx($large);
$largeheight = imagesy($large);

$foundX = -1;
$foundY = -1;

$keepThreshold = 20;

$potentialPositions = array();

for($x = 0; $x <= $largewidth - $smallwidth; ++$x)
{
    for($y = 0; $y <= $largeheight - $smallheight; ++$y)
    {
        // Scan the whole picture
        $error = GetImageErrorAt($large, $small, $x, $y);
        if($error["avg"] < $keepThreshold)
        {
            array_push($potentialPositions, array("x" => $x, "y" => $y, "error" => $error));
        }
    }
}

imagedestroy($small);
imagedestroy($large);

echo "Found " . count($potentialPositions) . " potential positions\n";

The goal here is to find how similar the pixels are, and if they are somewhat similar, keep the potential position. Here, I iterate each and every pixel of the large picture, this could be a point of optimization.

Now, where does this error come from?

Getting the likeliness

What I did here is iterate over the small picture and a "window" in the large picture checking how much difference there was on the red, green and blue channel:

function GetImageErrorAt($haystack, $needle, $startX, $startY)
{
    $error = array("red" => 0, "green" => 0, "blue" => 0, "avg" => 0);
    $needleWidth = imagesx($needle);
    $needleHeight = imagesy($needle);

    for($x = 0; $x < $needleWidth; ++$x)
    {
        for($y = 0; $y < $needleHeight; ++$y)
        {
            $nrgb = imagecolorat($needle, $x, $y);
            $hrgb = imagecolorat($haystack, $x + $startX, $y + $startY);

            $nr = $nrgb & 0xFF;
            $hr = $hrgb & 0xFF;

            $error["red"] += abs($hr - $nr);

            $ng = ($nrgb >> 8) & 0xFF;
            $hg = ($hrgb >> 8) & 0xFF;

            $error["green"] += abs($hg - $ng);

            $nb = ($nrgb >> 16) & 0xFF;
            $hb = ($hrgb >> 16) & 0xFF;

            $error["blue"] += abs($hb - $nb);
        }
    }
    $error["avg"] = ($error["red"] + $error["green"] + $error["blue"]) / ($needleWidth * $needleHeight);
    return $error;
}

So far, we've established a potential error value for every "window" in the large picture that could contain the small picture, and store them in an array if they seem "good enough".

Sorting

Now, we simply need to sort our best matches and keep the best one, it is most likely where our small picture is located:

function SortOnAvgError($a, $b)
{
    if($a["error"]["avg"] == $b["error"]["avg"])
    {
        return 0;
    }
    return ($a["error"]["avg"] < $b["error"]["avg"]) ? -1 : 1;
}

if(count($potentialPositions) > 0)
{
    usort($potentialPositions, "SortOnAvgError");
    $mostLikely = $potentialPositions[0];
    echo "Most likely at " . $mostLikely["x"] . "," . $mostLikely["y"];
}

Example

Given the two following pictures:

Large

and

Small

You should have the following result:

Found 5 potential positions
Most likely at 288,235

Which corresponds exactly with the position of our duck. The 4 other positions are 1 pixel up, down, left and right.

I am going to edit this entry after I'm done working on some optimizations for you, as this code is way too slow for big images (PHP performed even worse than I expected).

Edit

First, before doing anything to "optimize" the code, we need numbers, so I added

function microtime_float()
{
    list($usec, $sec) = explode(" ", microtime());
    return ((float)$usec + (float)$sec);
}

$time_start = microtime_float();

and

$time_end = microtime_float();
echo "in " . ($time_end - $time_start) . " seconds\n";

at the end to have a specific idea of how much time is taken during the algorithm. This way, I can know if my changes improve or make the code worse. Given that the current code with these pictures takes ~45 minutes to execute, we should be able to improve this time quite a lot.

A tentative that was not succesful, was to cache the RGB from the $needle to try to accelerate the GetImageErrorAt function, but it worsened the time.

Given that our computation is on a geometric scale, the more pixels we explore, the longer it will take... so a solution is to skip many pixels to try to locate as fast as possible our picture, and then more accurately zone in on our position.

I modified the error function to take as a parameter how to increment the x and y

function GetImageErrorAt($haystack, $needle, $startX, $startY, $increment)
{
    $needleWidth = imagesx($needle);
    $needleHeight = imagesy($needle);

    $error = array("red" => 0, "green" => 0, "blue" => 0, "avg" => 0, "complete" => true);

    for($x = 0; $x < $needleWidth; $x = $x + $increment)
    {
        for($y = 0; $y < $needleHeight; $y = $y + $increment)
        {
            $hrgb = imagecolorat($haystack, $x + $startX, $y + $startY);
            $nrgb = imagecolorat($needle, $x, $y);

            $nr = $nrgb & 0xFF;
            $hr = $hrgb & 0xFF;

            $ng = ($nrgb >> 8) & 0xFF;
            $hg = ($hrgb >> 8) & 0xFF;

            $nb = ($nrgb >> 16) & 0xFF;
            $hb = ($hrgb >> 16) & 0xFF;

            $error["red"] += abs($hr - $nr);
            $error["green"] += abs($hg - $ng);
            $error["blue"] += abs($hb - $nb);
        }
    }

    $error["avg"] = ($error["red"] + $error["green"] + $error["blue"]) / ($needleWidth * $needleHeight);

    return $error;
}

For example, passing 2 will make the function return 4 times faster, as we skip both x and y values.

I also added a stepSize for the main loop:

$stepSize = 10;

for($x = 0; $x <= $largewidth - $smallwidth; $x = $x + $stepSize)
{
    for($y = 0; $y <= $largeheight - $smallheight; $y = $y + $stepSize)
    {
        // Scan the whole picture
        $error = GetImageErrorAt($large, $small, $x, $y, 2);
        if($error["complete"] == true && $error["avg"] < $keepThreshold)
        {
            array_push($potentialPositions, array("x" => $x, "y" => $y, "error" => $error));
        }
    }
}

Doing this, I was able to reduce the execution time from 2657 seconds to 7 seconds at a price of precision. I increased the keepThreshold to have more "potential results".

Now that I wasn't checking each pixels, my best answer is:

Found 8 potential positions
Most likely at 290,240

As you can see, we're near our desired position, but it's not quite right.

What I'm going to do next is define a rectangle around this "pretty close" position to explore every pixel inside the stepSize we added.

I'm now changing the lower part of the script for:

if(count($potentialPositions) > 0)
{
    usort($potentialPositions, "SortOnAvgError");
    $mostLikely = $potentialPositions[0];
    echo "Most probably around " . $mostLikely["x"] . "," . $mostLikely["y"] . "\n";

    $startX = $mostLikely["x"] - $stepSize + 1; // - $stepSize was already explored
    $startY = $mostLikely["y"] - $stepSize + 1; // - $stepSize was already explored

    $endX = $mostLikely["x"] + $stepSize - 1;
    $endY = $mostLikely["y"] + $stepSize - 1;

    $refinedPositions = array();

    for($x = $startX; $x <= $endX; ++$x)
    {
        for($y = $startY; $y <= $endY; ++$y)
        {
            // Scan the whole picture
            $error = GetImageErrorAt($large, $small, $x, $y, 1); // now check every pixel!
            if($error["avg"] < $keepThreshold) // make the threshold smaller
            {
                array_push($refinedPositions, array("x" => $x, "y" => $y, "error" => $error));
            }
        }
    }

    echo "Found " . count($refinedPositions) . " refined positions\n";
    if(count($refinedPositions))
    {
        usort($refinedPositions, "SortOnAvgError");
        $mostLikely = $refinedPositions[0];
        echo "Most likely at " . $mostLikely["x"] . "," . $mostLikely["y"] . "\n";
    }
}

Which now gives me an output like:

Found 8 potential positions
Most probably around 290,240
Checking between X 281 and 299
Checking between Y 231 and 249
Found 23 refined positions
Most likely at 288,235
in 13.960182189941 seconds

Which is indeed the right answer, roughly 200 times faster than the initial script.

Edit 2

Now, my test case was a bit too simple... I changed it to a google image search:

Google Image Search

Looking for this picture (it's located at 718,432)

The Duck

Considering the bigger picture sizes, we can expect a longer processing time, but the algorithm did find the picture at the right position:

Found 123 potential positions
Most probably around 720,430
Found 17 refined positions
Most likely at 718,432
in 43.224536895752 seconds

Edit 3

I decided to try the option I told you in the comment, to scale down the pictures before executing the find, and I had great results with it.

I added this code before the first loop:

$smallresizedwidth = $smallwidth / 2;
$smallresizedheight = $smallheight / 2;

$largeresizedwidth = $largewidth / 2;
$largeresizedheight = $largeheight / 2;

$smallresized = imagecreatetruecolor($smallresizedwidth, $smallresizedheight);
$largeresized = imagecreatetruecolor($largeresizedwidth, $largeresizedheight);

imagecopyresized($smallresized, $small, 0, 0, 0, 0, $smallresizedwidth, $smallresizedheight, $smallwidth, $smallheight);
imagecopyresized($largeresized, $large, 0, 0, 0, 0, $largeresizedwidth, $largeresizedheight, $largewidth, $largeheight);

And for them main loop I iterated on the resized assets with the resized width and height. Then, when adding to the array, I double the x and y, giving the following:

array_push($potentialPositions, array("x" => $x * 2, "y" => $y * 2, "error" => $error));

The rest of the code remains the same, as we want to do the precise location on the real size pictures. All you have to do is add at the end:

imagedestroy($smallresized);
imagedestroy($largeresized);

Using this version of the code, with the google image result, I had:

Found 18 potential positions
Most around 720,440
Found 17 refined positions
Most likely at 718,432
in 11.499078989029 seconds

A 4 times performance increase!

Hope this helps

like image 61
emartel Avatar answered Dec 25 '22 12:12

emartel


Use ImageMagick.

This page will give you answer: How can I detect / calculate if a small pictures is present inside a bigger picture?

like image 45
Jirilmon Avatar answered Dec 25 '22 12:12

Jirilmon