Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Finding a picture in a picture with java?

what i want to to is analyse input from screen in form of pictures. I want to be able to identify a part of an image in a bigger image and get its coordinates within the bigger picture. Example:

box

would have to be located in

big picture

And the result would be the upper right corner of the picture in the big picture and the lower left of the part in the big picture. As you can see, the white part of the picture is irrelevant, what i basically need is just the green frame. Is there a library that can do something like this for me? Runtime is not really an issue.

What i want to do with this is just generating a few random pixel coordinates and recognize the color in the big picture at that position, to recognize the green box fast later. And how would it decrease performance, if the white box in the middle is transparent?

The question has been asked several times on SO as it seems without a single answer. I found i found a solution at http://werner.yellowcouch.org/Papers/subimg/index.html . Unfortunately its in C++ and i do not understand a thing. Would be nice to have a Java implementation on SO.

like image 317
tarrasch Avatar asked Sep 26 '12 09:09

tarrasch


People also ask

Can we read image in Java?

Java implements a particular type of object called a BufferedImage for images in Java. A BufferedImage can be read from several distinct image types (i.e., BMP, HEIC, etc.).

What data type is a picture in Java?

By default, Java supports only these five formats for images: JPEG, PNG, BMP, WEBMP, GIF. If we attempt to work with an image file in a different format, our application will not be able to read it and will throw a NullPointerException when accessing the BufferedImage variable.


2 Answers

The problem is hard to answer in general because people often have different requirements for what counts as an image match. Some people might want to search for an image that might have a different size or orientation than the template image they provide, in which case a scale- or rotation-invariant approach is needed. There are various option such as looking for similar textures, features or shapes, but I will focus on approaches that only looks for pixels of a similar colour that are in the exact same positions as the template image. This seems most suited to your example which seems to fall in the category of template matching.

Possible approaches

In this case the problem is closely related to the signal processing concepts of cross-correlation and convolution, which is often implemented using an FFT as it is very fast (its in the name!). This is what was used in the approach you linked to, and the FFTW library could be of use when attempting such an implementation as it has wrappers for Java. Using cross-correlation works quite well, as seen in this question, as well as the famous waldo question.

Another option is not to use all the pixels for comparison, but rather only the features that are easier to find and more likely to be unique. This would require a feature descriptor like SIFT, SURF or one of many others. You would need to find all the features in both images and then look for features that have similar positions to those in the template image. With this approach I suggest you use JavaCV.

The random guessing approach you mentioned should work fast when it is possible, but unfortunately it isn't generally applicable as it will only be useful with certain image combinations that produce a close match near the correct location.

Unless you use an external library, the simplest method in Java would be what I would call a brute-force approach, although it is a bit slow. The brute-force approach simply involves searching the whole image for the sub-region that best matches the image you are looking for. I'll explain this approach further. First you need to define how to determine the similarity between two equally sized images. This can be done by summing the differences between the pixel's colours which requires a definition for the difference between RGB values.

Colour similarity

One way of determining the difference between two RGB values is to use the euclidean distance:

sqrt( (r1-r2)^2 + (g1-g2)^2 + (b1-b2)^2 )

There are different colour spaces than RGB that can be used, but since your sub-image is most likely near identical (instead of just visually similar), this should work fine. If you have an ARGB colour space and you don't want semi-transparent pixels to influence your results as much, you can use:

a1 * a2 * sqrt( (r1-r2)^2 + (g1-g2)^2 + (b1-b2)^2 )

which will give a smaller value if the colours have transparency (assuming a1 and a2 are between 0 and 1). I would suggest that you use transparency instead of white areas and to use the PNG file format since it doesn't use lossy compression that subtly distorts the colours in the image.

Comparing images

To compare equally-sized images you can sum the difference between their individual pixels. This sum is then a measure of the difference and you can search for the region in the image with the lowest difference measure. It becomes harder if you don't even know whether the image contains the sub-image, but this would be indicated by the best match having a high difference measure. If you want, you could also normalize the difference measure to lie between 0 and 1 by dividing it by the size of the sub-image and the maximum possible RGB difference (sqrt(3) with the euclidean distance and RGB values from 0 to 1). Zero would then be an identical match and anything close to one would be as different as possible.

Brute-force implementation

Here's a simple implementation that uses the brute-force approach to search the image. With your example images it found the location at (139,55) to be the top-left location of the region with the best match (which looks correct). It took about 10 to 15 seconds to run on my PC and the normalized difference measure of the location was around 0.57.

 /**  * Finds the a region in one image that best matches another, smaller, image.  */  public static int[] findSubimage(BufferedImage im1, BufferedImage im2){    int w1 = im1.getWidth(); int h1 = im1.getHeight();    int w2 = im2.getWidth(); int h2 = im2.getHeight();    assert(w2 <= w1 && h2 <= h1);    // will keep track of best position found    int bestX = 0; int bestY = 0; double lowestDiff = Double.POSITIVE_INFINITY;    // brute-force search through whole image (slow...)    for(int x = 0;x < w1-w2;x++){      for(int y = 0;y < h1-h2;y++){        double comp = compareImages(im1.getSubimage(x,y,w2,h2),im2);        if(comp < lowestDiff){          bestX = x; bestY = y; lowestDiff = comp;        }      }    }    // output similarity measure from 0 to 1, with 0 being identical    System.out.println(lowestDiff);    // return best location    return new int[]{bestX,bestY};  }   /**  * Determines how different two identically sized regions are.  */  public static double compareImages(BufferedImage im1, BufferedImage im2){    assert(im1.getHeight() == im2.getHeight() && im1.getWidth() == im2.getWidth());    double variation = 0.0;    for(int x = 0;x < im1.getWidth();x++){      for(int y = 0;y < im1.getHeight();y++){         variation += compareARGB(im1.getRGB(x,y),im2.getRGB(x,y))/Math.sqrt(3);      }    }    return variation/(im1.getWidth()*im1.getHeight());  }   /**  * Calculates the difference between two ARGB colours (BufferedImage.TYPE_INT_ARGB).  */  public static double compareARGB(int rgb1, int rgb2){    double r1 = ((rgb1 >> 16) & 0xFF)/255.0; double r2 = ((rgb2 >> 16) & 0xFF)/255.0;    double g1 = ((rgb1 >> 8) & 0xFF)/255.0;  double g2 = ((rgb2 >> 8) & 0xFF)/255.0;    double b1 = (rgb1 & 0xFF)/255.0;         double b2 = (rgb2 & 0xFF)/255.0;    double a1 = ((rgb1 >> 24) & 0xFF)/255.0; double a2 = ((rgb2 >> 24) & 0xFF)/255.0;    // if there is transparency, the alpha values will make difference smaller    return a1*a2*Math.sqrt((r1-r2)*(r1-r2) + (g1-g2)*(g1-g2) + (b1-b2)*(b1-b2));  } 

I haven't looked, but maybe one of these Java image processing libraries could also be of some use:

  • Marvin project
  • ImageJ

If speed is really important I think the best approach would be an implementation using cross-correlation or feature descriptors that uses an external library.

like image 124
Silveri Avatar answered Oct 05 '22 22:10

Silveri


What you want is to find picture block by mask/boundaries.

It can be done without external library. On low level, every picture is a matrix of numbers, your mask is the matrix of numbers also. You can just linear scan big matrix and find the area that follows rules defined by your mask.

Example:

Big matrix:

1 0 1 1 1 1  0 1 0 1 0 0 0 0 0 1 1 1 0 1 1 0 0 0 

Mask:

1 1 1 1 0 0 1 1 1 

Applies this algorithm you detect a matched block in big matrix in upper right corner, that gives you start/end matrix indices and you can calculate these values in pixels.

In real problem you will have not a number set [0, 1] but much bigger - byte for example ([0, 256]). To make algorithm work better, matched means not exact number matching but possible with some deviations +-5 or something like this.

like image 24
mishadoff Avatar answered Oct 05 '22 23:10

mishadoff