I am trying to find image in an image. I do this for desktop automation. At this moment, I'm trying to be fast, not precise. As such, I have decided to match similar image solely based on the same average color.
If I pick several icons on my desktop, for example:
And I will search for the last one (I'm still wondering what this file is):
You can clearly see what is most likely to be the match:
In different situations, this may not work. However when image size is given, it should be pretty reliable and lightning fast.
I can get a screenshot as BufferedImage
object:
MSWindow window = MSWindow.windowFromName("Firefox", false);
BufferedImage img = window.screenshot();
//Or, if I can estimate smaller region for searching:
BufferedImage img2 = window.screenshotCrop(20,20,50,50);
Of course, the image to search image will be loaded from template saved in a file:
BufferedImage img = ImageIO.read(...whatever goes in there, I'm still confused...);
I explained what all I know so that we can focus on the only problem:
Speed wins here. In this exceptional case, I consider it more valuable than code readability.
The typical approach to averaging RGB colors is to add up all the red, green, and blue values, and divide each by the number of pixels to get the components of the final color.
A BufferedImage is comprised of a ColorModel and a Raster of image data. The number and types of bands in the SampleModel of the Raster must match the number and types required by the ColorModel to represent its color and alpha components. All BufferedImage objects have an upper left corner coordinate of (0, 0).
I think that no matter what you do, you are going to have an O(wh)
operation, where w
is your width and h
is your height.
Therefore, I'm going to post this (naive) solution to fulfil the first part of your question as I do not believe there is a faster solution.
/*
* Where bi is your image, (x0,y0) is your upper left coordinate, and (w,h)
* are your width and height respectively
*/
public static Color averageColor(BufferedImage bi, int x0, int y0, int w,
int h) {
int x1 = x0 + w;
int y1 = y0 + h;
long sumr = 0, sumg = 0, sumb = 0;
for (int x = x0; x < x1; x++) {
for (int y = y0; y < y1; y++) {
Color pixel = new Color(bi.getRGB(x, y));
sumr += pixel.getRed();
sumg += pixel.getGreen();
sumb += pixel.getBlue();
}
}
int num = w * h;
return new Color(sumr / num, sumg / num, sumb / num);
}
There is a constant time method for finding the mean colour of a rectangular section of an image but it requires a linear preprocess. This should be fine in your case. This method can also be used to find the mean value of a rectangular prism in a 3d array or any higher dimensional analog of the problem. I will be using a gray scale example but this can be easily extended to 3 or more channels simply by repeating the process.
Lets say we have a 2 dimensional array of numbers we will call "img
".
The first step is to generate a new array of the same dimensions where each element contains the sum of all values in the original image that lie within the rectangle that bounds that element and the top left element of the image.
You can use the following method to construct such an image in linear time:
int width = 1920;
int height = 1080;
//source data
int[] img = GrayScaleScreenCapture();
int[] helperImg = int[width * height]
for(int y = 0; y < height; ++y)
{
for(int x = 0; x < width; ++x)
{
int total = img[y * width + x];
if(x > 0)
{
//Add value from the pixel to the left in helperImg
total += helperImg[y * width + (x - 1)];
}
if(y > 0)
{
//Add value from the pixel above in helperImg
total += helperImg[(y - 1) * width + x];
}
if(x > 0 && y > 0)
{
//Subtract value from the pixel above and to the left in helperImg
total -= helperImg[(y - 1) * width + (x - 1)];
}
helperImg[y * width + x] = total;
}
}
Now we can use helperImg
to find the total of all values within a given rectangle of img
in constant time:
//Some Rectangle with corners (x0, y0), (x1, y0) , (x0, y1), (x1, y1)
int x0 = 50;
int x1 = 150;
int y0 = 25;
int y1 = 200;
int totalOfRect = helperImg[y1 * width + x1];
if(x0 > 0)
{
totalOfRect -= helperImg[y1 * width + (x0 - 1)];
}
if(y0 > 0)
{
totalOfRect -= helperImg[(y0 - 1) * width + x1];
}
if(x0 > 0 && y0 > 0)
{
totalOfRect += helperImg[(y0 - 1) * width + (x0 - 1)];
}
Finally, we simply divide totalOfRect
by the area of the rectangle to get the mean value:
int rWidth = x1 - x0 + 1;
int rheight = y1 - y0 + 1;
int meanOfRect = totalOfRect / (rWidth * rHeight);
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With