Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Algorithm to detect overlapping rows of two images

Let's say I have 2 images A and B as below.

enter image description here

Notice that the bottom of A overlaps with the top of B for n rows of pixels, denoted by the two red rectangles. A and B have the same number of columns but might have different number of rows.

Two questions:

  • Given A and B, how to determine n efficiently?
  • If B is somehow changed in a way that 30%-50% of its pixels are completely replaced (for example, imagine the top left area showing # of votes/answers/views is replaced with an ad banner). How to determine n?

If anyone can point to an algorithm or better yet, an implementation in any language (preferred C/C++, C#, Java and JavaScript), it is much appreciated.

like image 774
Buu Nguyen Avatar asked Apr 25 '13 15:04

Buu Nguyen


3 Answers

If I understood correctly, you probably want to look at normalized cross correlation of greyscale versions of the two images. Where you have large images, or large overlapping regions, this is done most efficiently in the frequency domain using the FFTs of the images (or overlap areas) and is called phase correlation.

The basic steps I would take in your situation are as follows:

  1. Extract the bottom half of the first image and the top half of the second image.
  2. Convert both image patches to greyscale.
  3. Perform FFT on each image patch (there are some details here relating to windowing and padding).
  4. Calculate the complex conjugate of the two FFTs (same as correlation in spatial domain).
  5. Do inverse FFT on the result.
  6. Find the peak in the above to get the XY shift that best aligns the two images.

Having found the relative offset between the top and bottom image patches, you can easily calculate n as you required.

If you want to experiment without having to code the above from scratch, OpenCV has a number of functions for template matching, which you can easily try. See here for details.

If part of either image has been changed - e.g. by a banner ad - the above procedure still gives the best match, and the magnitude of the peak you find in step 6 gives an indication of the match "confidence" - so you can get a rough idea of how similar the two areas are.

like image 166
Roger Rowland Avatar answered Oct 07 '22 17:10

Roger Rowland


I had a little play at doing this with ImageMagick. Here is the animation of what I did, and the explanation and code follow.

enter image description here

First I grabbed a couple of StackOverflow pages, using webkit2png, calling them a.png and b.png.

Then I cropped a rectangle out of the top-left of b.png and a column the same width, but the full height out of a.png

That gave me this:

enter image description here

and this

enter image description here

I now overlay the smaller rectangle from the second page onto the bottom of the strip from the first page. I then calculate the difference between the two images by subtracting one from the other and note that when the difference is zero, the pictures must be the same, and the output image will be black, so I have found the point at which they overlap.

Here is the code:

#!/bin/bash
# Grab page 2 as "A" and page 3 as "B"
# webkit2png -F -o A http://stackoverflow.com/questions?page=2&sort=newest
# webkit2png -F -o B http://stackoverflow.com/questions?page=3&sort=newest

BLOBH=256  # blob height
BLOBW=256  # blob width

# Get height of x.png
XHEIGHT=$(identify -format "%h" x.png)

# Crop a column 256 pixels out of a.png that doesn't contain adverts or junk, into x.png
convert a.png -crop ${BLOBW}x+0+0 x.png

# Crop a rectangle 256x256 pixels out of top left corner of b.png, into y.png
convert b.png -crop ${BLOBW}x${BLOBH}+0+0 y.png

# Now slide y.png up across x.png, starting at the bottom of x.png
# ... differencing the two images as we go
# ... stop when the difference is nothing, i.e. they are the same and difference is black image
lines=0
while :; do
   OFFSET=$((XHEIGHT-BLOBH-1-lines))
   if [ $OFFSET -lt 0 ]; then exit; fi
   FN=$(printf "out-%04d.png" $lines)
   diff=$(convert x.png -crop ${BLOBW}x${BLOBH}+0+${OFFSET} +repage \
           y.png \
           -fuzz 5% -compose difference -composite +write $FN \
           \( +clone -evaluate set 0 \) -metric AE -compare -format "%[distortion]" info:)
   echo $diff:$lines
   ((lines++))
done
n=$((BLOBH+lines))
like image 28
Mark Setchell Avatar answered Oct 07 '22 19:10

Mark Setchell


The FFT solution might be more complex than you were hoping for. For a general problem, that might be the only robust way.

For a simple solution, you need to start making assumptions. For example, can you guarantee that the columns of the images line up (barring the noted changes)? This allows you to go down the path suggested by @n.m.

Can you cut the image into vertical strips, and consider a row matches if a sufficient proportion of the strips match?

[ This could be redone to use a few passes with difference column offsets if we need to be robust to that.]

This gives something like:

class Image
{
public:
    virtual ~Image() {}
    typedef int Pixel;
    virtual Pixel* getRow(int rowId) const = 0;
    virtual int getWidth() const = 0;
    virtual int getHeight() const = 0;
};

class Analyser
{
    Analyser(const Image& a, const Image& b)
        : a_(a), b_(b) {}
    typedef Image::Pixel* Section;
    static const int numStrips = 16;
    struct StripId
    {
        StripId(int r = 0, int c = 0)
            : row_(r), strip_(c)
        {}
        int row_;
        int strip_;
    };
    typedef std::unordered_map<unsigned, StripId> StripTable;
    int numberOfOverlappingRows()
    {
        int commonWidth = std::min(a_.getWidth(), b_.getWidth());
        int stripWidth = commonWidth/numStrips;
        StripTable aHash;
        createStripTable(aHash, a_, stripWidth);
        StripTable bHash;
        createStripTable(bHash, b_, stripWidth);
        // This is the position that the bottom row of A appears in B.
        int bottomOfA = 0;
        bool canFindBottomOfAInB = canFindLine(a_.getRow(a_.getHeight() - 1), bHash, stripWidth,  bottomOfA);
        int topOfB= 0;
        bool canFindTopOfBInA =  canFindLine(b_.getRow(0), aHash, stripWidth, topOfB);
        int topOFBfromBottomOfA = a_.getHeight() - topOfB;
        // Expect topOFBfromBottomOfA == bottomOfA
        return bottomOfA;
    }
    bool canFindLine(Image::Pixel* source, StripTable& target, int stripWidth, int& matchingRow)
    {
        Image::Pixel* strip = source;
        std::map<int, int> matchedRows;
        for(int index = 0; index < stripWidth; ++index)
        {
            Image::Pixel hashValue = getHashOfStrip(strip,stripWidth);      
            bool match =  target.count(hashValue) > 0;
            if (match)
            {
                ++matchedRows[target[hashValue].row_];
            }
            strip += stripWidth;
        }
        // Can set a threshold requiring more matches than 0
        if (matchedRows.size() == 0)
            return false;
        // FIXME return the most matched row.
        matchingRow = matchedRows.begin()->first;
        return true; 
    }
    Image::Pixel* getStrip(const Image& im, int row, int stripId, int stripWidth)
    {
        return im.getRow(row) + stripId * stripWidth;
    }
    static Image::Pixel getHashOfStrip(Image::Pixel* strip, unsigned width)
    {
        Image::Pixel hashValue = 0;
        for(unsigned col = 0; col < width; ++col)
        {
            hashValue |= *(strip + col);
        }
    }
    void createStripTable(StripTable& hash, const Image& image, int stripWidth)
    {
        for(int row = 0; row < image.getHeight(); ++row)
        {
            for(int index = 0; index < stripWidth; ++index)
            {
                // Warning: Not this simple!
                // If images are sourced from lossy intermediate and hence pixels not _exactly_ the same, need some kind of fuzzy equality here.
                // Details are going to depend on the image format etc, but this is the gist.
                Image::Pixel* strip = getStrip(image, row, index, stripWidth);
                Image::Pixel hashValue = getHashOfStrip(strip,stripWidth);      
                hash[hashValue] = StripId(row, index);
            }
        }
    }

    const Image& a_;
    const Image& b_;

};
like image 42
Keith Avatar answered Oct 07 '22 18:10

Keith