Let's say I have two images, A and B (images not shown here).
Notice that the bottom of A overlaps with the top of B for n rows of pixels, denoted by the two red rectangles in the original figures. A and B have the same number of columns but may have different numbers of rows.
Two questions:

1. How can I determine whether A and B overlap at all?
2. If they do, how can I find n efficiently?

If anyone can point to an algorithm, or better yet an implementation in any language (C/C++, C#, Java and JavaScript preferred), it would be much appreciated.
If I understood correctly, you probably want to look at normalized cross correlation of greyscale versions of the two images. Where you have large images, or large overlapping regions, this is done most efficiently in the frequency domain using the FFTs of the images (or overlap areas) and is called phase correlation.
The basic steps I would take in your situation are as follows:

1. Convert both images to greyscale.
2. Take the strip at the bottom of A and the strip at the top of B as the candidate overlap regions, with the same dimensions.
3. Compute the 2D FFT of each region.
4. Multiply one FFT by the complex conjugate of the other, normalising by the magnitude.
5. Take the inverse FFT to get the correlation surface.
6. Find the peak of that surface; its position gives the relative offset between the two regions.
Having found the relative offset between the top and bottom image patches, you can easily calculate n as you required.
If you want to experiment without having to code the above from scratch, OpenCV has a number of functions for template matching, which you can easily try. See here for details.
If part of either image has been changed - e.g. by a banner ad - the above procedure still gives the best match, and the magnitude of the peak you find in step 6 gives an indication of the match "confidence" - so you can get a rough idea of how similar the two areas are.
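In case it helps to see the spatial-domain equivalent of that peak search, here is a small C++ sketch (my own, not from the answer above) that scores every candidate overlap n by the mean squared difference of 1-D row profiles (each image row reduced to a single intensity value) and keeps the best. The profile values are made up for illustration; a real implementation on large images would use the FFT route described above.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Score every candidate overlap n between the bottom of profile a and the
// top of profile b by mean squared difference; the minimum plays the role
// of the correlation peak in the frequency-domain method.
int findOverlap(const std::vector<double>& a, const std::vector<double>& b)
{
    int bestN = 0;
    double bestScore = 1e300;
    int maxN = (int)std::min(a.size(), b.size());
    for (int n = 1; n <= maxN; ++n)
    {
        double msd = 0;
        for (int i = 0; i < n; ++i)
        {
            double d = a[a.size() - n + i] - b[i];
            msd += d * d;
        }
        msd /= n;  // normalise so longer overlaps aren't penalised
        if (msd < bestScore) { bestScore = msd; bestN = n; }
    }
    return bestN;
}
```

Here the last four values of `a` equal the first four of `b`, so the minimum lands at n = 4; the size of `bestScore` at the minimum serves as the rough "confidence" mentioned above.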
I had a little play at doing this with ImageMagick. Here is the animation of what I did, and the explanation and code follow.
First I grabbed a couple of StackOverflow pages using webkit2png, calling them a.png and b.png.
Then I cropped a rectangle out of the top-left of b.png, and a column of the same width but the full height out of a.png. That gave me the two crops (images not shown here).
I now overlay the smaller rectangle from the second page onto the bottom of the strip from the first page. I then calculate the difference between the two images by subtracting one from the other and note that when the difference is zero, the pictures must be the same, and the output image will be black, so I have found the point at which they overlap.
Here is the code:
#!/bin/bash
# Grab page 2 as "A" and page 3 as "B"
# webkit2png -F -o A "http://stackoverflow.com/questions?page=2&sort=newest"
# webkit2png -F -o B "http://stackoverflow.com/questions?page=3&sort=newest"
BLOBH=256 # blob height
BLOBW=256 # blob width
# Crop a column 256 pixels wide out of a.png that doesn't contain adverts or junk, into x.png
convert a.png -crop ${BLOBW}x+0+0 x.png
# Get height of x.png (must be done after x.png is created)
XHEIGHT=$(identify -format "%h" x.png)
# Crop a rectangle 256x256 pixels out of top-left corner of b.png, into y.png
convert b.png -crop ${BLOBW}x${BLOBH}+0+0 y.png
# Now slide y.png up across x.png, starting at the bottom of x.png
# ... differencing the two images as we go
# ... stopping when the difference is nothing, i.e. they are the same and the difference image is black
lines=0
while :; do
   OFFSET=$((XHEIGHT-BLOBH-lines))
   if [ $OFFSET -lt 0 ]; then echo "No match found"; exit 1; fi
   FN=$(printf "out-%04d.png" $lines)
   diff=$(convert x.png -crop ${BLOBW}x${BLOBH}+0+${OFFSET} +repage \
       y.png \
       -fuzz 5% -compose difference -composite +write $FN \
       \( +clone -evaluate set 0 \) -metric AE -compare -format "%[distortion]" info:)
   echo $diff:$lines
   if [ "$diff" -eq 0 ]; then break; fi
   ((lines++))
done
n=$((BLOBH+lines))
echo "n = $n"
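The same slide-and-compare loop can be expressed without ImageMagick. Here is a hedged C++ sketch operating on toy greyscale columns (one int per pixel); the tolerance of 12 intensity levels is my stand-in for the script's 5% fuzz, and the data is made up:

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

// Slide a window of height blobH up column x from the bottom, comparing it
// against the top blobH rows of column y (cf. the bash loop above).
// Returns n, the number of overlapping rows, or -1 if no match is found.
int findN(const std::vector<int>& x, const std::vector<int>& y, int blobH)
{
    for (int lines = 0; ; ++lines)
    {
        int offset = (int)x.size() - blobH - lines;
        if (offset < 0) return -1;            // slid past the top: no overlap
        bool same = true;
        for (int i = 0; i < blobH && same; ++i)
            same = std::abs(x[offset + i] - y[i]) <= 12;  // ~5% fuzz on 0..255
        if (same) return blobH + lines;       // n = window height + rows slid
    }
}
```

With a 10-row column stepping by 25 intensity levels and a 3-row window matching 4 rows up from the bottom, the function reports n = 6, i.e. the match row counted from the bottom of x.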
The FFT solution might be more complex than you were hoping for. For a general problem, that might be the only robust way.
For a simple solution, you need to start making assumptions. For example, can you guarantee that the columns of the images line up (barring the noted changes)? This allows you to go down the path suggested by @n.m.
Can you cut the image into vertical strips, and consider a row matches if a sufficient proportion of the strips match?
[This could be redone to use a few passes with different column offsets if we need to be robust to that.]
This gives something like:
#include <algorithm>
#include <map>
#include <unordered_map>

class Image
{
public:
    virtual ~Image() {}
    typedef int Pixel;
    virtual Pixel* getRow(int rowId) const = 0;
    virtual int getWidth() const = 0;
    virtual int getHeight() const = 0;
};

class Analyser
{
public:
    Analyser(const Image& a, const Image& b)
      : a_(a), b_(b) {}

    typedef Image::Pixel* Section;
    static const int numStrips = 16;

    struct StripId
    {
        StripId(int r = 0, int c = 0)
          : row_(r), strip_(c)
        {}
        int row_;
        int strip_;
    };

    typedef std::unordered_map<unsigned, StripId> StripTable;

    int numberOfOverlappingRows()
    {
        int commonWidth = std::min(a_.getWidth(), b_.getWidth());
        int stripWidth = commonWidth/numStrips;

        StripTable aHash;
        createStripTable(aHash, a_, stripWidth);
        StripTable bHash;
        createStripTable(bHash, b_, stripWidth);

        // This is the position that the bottom row of A appears in B.
        int bottomOfA = 0;
        bool canFindBottomOfAInB = canFindLine(a_.getRow(a_.getHeight() - 1), bHash, stripWidth, bottomOfA);

        int topOfB = 0;
        bool canFindTopOfBInA = canFindLine(b_.getRow(0), aHash, stripWidth, topOfB);

        int topOfBFromBottomOfA = a_.getHeight() - topOfB;
        // Expect topOfBFromBottomOfA == bottomOfA
        (void)canFindBottomOfAInB; (void)canFindTopOfBInA; (void)topOfBFromBottomOfA;
        return bottomOfA;
    }

    bool canFindLine(Image::Pixel* source, StripTable& target, int stripWidth, int& matchingRow)
    {
        Image::Pixel* strip = source;
        std::map<int, int> matchedRows;
        for(int index = 0; index < numStrips; ++index)  // iterate strips, not pixels
        {
            Image::Pixel hashValue = getHashOfStrip(strip, stripWidth);
            bool match = target.count(hashValue) > 0;
            if (match)
            {
                ++matchedRows[target[hashValue].row_];
            }
            strip += stripWidth;
        }
        // Can set a threshold requiring more matches than 0
        if (matchedRows.empty())
            return false;
        // FIXME return the most matched row.
        matchingRow = matchedRows.begin()->first;
        return true;
    }

    Image::Pixel* getStrip(const Image& im, int row, int stripId, int stripWidth)
    {
        return im.getRow(row) + stripId * stripWidth;
    }

    static Image::Pixel getHashOfStrip(Image::Pixel* strip, unsigned width)
    {
        Image::Pixel hashValue = 0;
        for(unsigned col = 0; col < width; ++col)
        {
            hashValue |= *(strip + col);  // crude hash; see warning below
        }
        return hashValue;
    }

    void createStripTable(StripTable& hash, const Image& image, int stripWidth)
    {
        for(int row = 0; row < image.getHeight(); ++row)
        {
            for(int index = 0; index < numStrips; ++index)  // iterate strips, not pixels
            {
                // Warning: Not this simple!
                // If images come through a lossy intermediate and hence the pixels are not
                // _exactly_ the same, some kind of fuzzy equality is needed here.
                // Details are going to depend on the image format etc., but this is the gist.
                Image::Pixel* strip = getStrip(image, row, index, stripWidth);
                Image::Pixel hashValue = getHashOfStrip(strip, stripWidth);
                hash[hashValue] = StripId(row, index);
            }
        }
    }

private:
    const Image& a_;
    const Image& b_;
};
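As a sanity check of the lookup logic, here is a self-contained sketch of the same idea that hashes whole rows of two made-up images (using FNV-1a, my choice rather than the OR-hash above) and derives n from where B's top row appears in A; strips and fuzzy matching are omitted:

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

using Row = std::vector<int>;

// Toy hash of one whole row; a real version would be per-strip and fuzzy.
uint64_t hashRow(const Row& r)
{
    uint64_t h = 1469598103934665603ull;            // FNV-1a offset basis
    for (int p : r) { h ^= (uint64_t)p; h *= 1099511628211ull; }
    return h;
}

// n = number of overlapping rows: find B's top row in A's hash table,
// then count rows from that match down to the bottom of A.
int overlapRows(const std::vector<Row>& a, const std::vector<Row>& b)
{
    std::unordered_map<uint64_t, int> rowOfA;
    for (int r = 0; r < (int)a.size(); ++r)
        rowOfA[hashRow(a[r])] = r;                  // last occurrence wins
    auto it = rowOfA.find(hashRow(b[0]));
    if (it == rowOfA.end()) return 0;               // top of B not found in A
    return (int)a.size() - it->second;              // rows from match to bottom of A
}
```

With a 5-row A whose last two rows equal B's first two, the lookup lands on A's row 3 and reports n = 2.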