
OCR: Difference between two frames

I am trying to find an easy way to implement OCR with OpenCV. I am very new to image processing! I am playing a video that is decoded with a specific codec using an RLE algorithm.

For each decoded frame, I would like to compare it with the previous one and store the pixels that have changed between the two frames.

Most of the existing solutions give a difference between the two frames, but I would like to keep only the pixels that have changed and store them in a table. I could then analyze each group of changed pixels instead of analyzing the whole image each time.

I planned to use a "blob detection" algorithm, but I'm stuck before being able to implement it.

Today, I'm trying this:

char *prevFrame;
char *curFrame;
QVector<LONG> DiffPixel;

//for each frame
// note: this subtracts the two pointers (addresses), not the pixel values
DiffPixel.push_back(curFrame-prevFrame);

[image]

I really want the "Only changed pixel result" solution. Could anyone give me some tips, or correct me if I'm going the wrong way?

EDIT:

New question: what if there are multiple areas of changed pixels? Will it be possible to have one table per block of changed pixels, or will there be only one table? Take the example below:

[image: Multiple Areas Pixels]

The ideal result would be two Mat matrices: the first containing the first orange square and the second containing the second orange square. This avoids having to "scan" almost the entire frame, which is what happens if the result is stored in a single matrix whose resolution is nearly that of the full frame.

The main goal here is to minimize the area (aka the resolution) to analyze to find text.

Robert Jones asked Dec 01 '15 16:12



1 Answer

After loading your images:

[image: img1]

[image: img2]

you can apply an XOR operation to get the differences. The result has the same number of channels as the input images:

[image: XOR]
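
For raw decoded buffers such as the char * frames in the question, the same XOR idea can be sketched without OpenCV. This is only an illustration: it assumes both frames are tightly packed 8-bit buffers of identical size, and the helper name xorFrames is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: byte-wise XOR of two equal-sized 8-bit frames.
// A result byte is zero exactly where both frames hold the same value.
std::vector<std::uint8_t> xorFrames(const std::vector<std::uint8_t>& prev,
                                    const std::vector<std::uint8_t>& cur)
{
    assert(prev.size() == cur.size());
    std::vector<std::uint8_t> out(cur.size());
    for (std::size_t i = 0; i < cur.size(); ++i)
    {
        out[i] = static_cast<std::uint8_t>(prev[i] ^ cur[i]);
    }
    return out;
}
```

Any non-zero byte in the result marks a channel value that changed between the two frames.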

You can then create a binary mask OR-ing all channels:

[image: mask]
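
On a raw buffer, OR-ing the channels might look like the sketch below. It assumes an interleaved layout (for example B, G, R, B, G, R, ...) with one byte per channel; toBinaryMask is a hypothetical name, not an OpenCV function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: collapse an interleaved N-channel XOR image into a
// one-byte-per-pixel mask, 255 where any channel differs and 0 elsewhere.
std::vector<std::uint8_t> toBinaryMask(const std::vector<std::uint8_t>& xorImage,
                                       std::size_t channels)
{
    assert(channels > 0 && xorImage.size() % channels == 0);
    const std::size_t pixels = xorImage.size() / channels;
    std::vector<std::uint8_t> mask(pixels, 0);
    for (std::size_t p = 0; p < pixels; ++p)
    {
        std::uint8_t any = 0;
        for (std::size_t c = 0; c < channels; ++c)
        {
            any |= xorImage[p * channels + c]; // OR all channels of pixel p
        }
        mask[p] = any ? 255 : 0;               // binarize
    }
    return mask;
}
```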

Then you can copy the values of img2 that correspond to non-zero elements in the mask to a white image:

[image: diff]
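
If you would rather keep the changed pixels in a table, as the question asks, instead of a white image, a minimal sketch could look like this. It assumes a single-channel (grayscale) frame stored row by row; the ChangedPixel struct and collectChanged function are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical table entry: position and new value of one changed pixel.
struct ChangedPixel
{
    int x;
    int y;
    std::uint8_t value;
};

// Collect every pixel of the current frame whose mask entry is non-zero.
// Assumes a grayscale frame stored row by row, one byte per pixel.
std::vector<ChangedPixel> collectChanged(const std::vector<std::uint8_t>& cur,
                                         const std::vector<std::uint8_t>& mask,
                                         int width)
{
    assert(width > 0 && cur.size() == mask.size());
    const std::size_t w = static_cast<std::size_t>(width);
    assert(cur.size() % w == 0);

    std::vector<ChangedPixel> table;
    for (std::size_t i = 0; i < mask.size(); ++i)
    {
        if (mask[i] != 0)
        {
            table.push_back({static_cast<int>(i % w), static_cast<int>(i / w), cur[i]});
        }
    }
    return table;
}
```

The table only grows with the number of changed pixels, so a mostly static video produces very small tables.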


UPDATE

If you have multiple areas where pixels changed, like this:

[image]

You'll find a difference mask (after binarization all non-zero pixels are set to 255) like:

[image]

You can then extract connected components and draw each connected component on a new black-initialized mask:

[image]

Then, as before, you can copy the values of img2 that correspond to non-zero elements in each mask to a white image.

[image]
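
Since the goal is to minimize the area to analyze, it may be enough to keep one bounding box per changed region rather than a full-size image per region. The sketch below is a hand-rolled breadth-first flood fill, not OpenCV's findContours; it assumes a row-major one-byte-per-pixel mask and 4-connectivity, and the Box struct and changedRegions function are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical bounding box with inclusive corners.
struct Box
{
    int x0, y0, x1, y1;
};

// Return one bounding box per 4-connected region of non-zero mask pixels.
std::vector<Box> changedRegions(const std::vector<std::uint8_t>& mask,
                                int width, int height)
{
    assert(width > 0 && height > 0);
    const std::size_t w = static_cast<std::size_t>(width);
    assert(mask.size() == w * static_cast<std::size_t>(height));

    std::vector<char> seen(mask.size(), 0);
    std::vector<Box> boxes;
    const int dx[4] = {1, -1, 0, 0};
    const int dy[4] = {0, 0, 1, -1};

    for (int y = 0; y < height; ++y)
    {
        for (int x = 0; x < width; ++x)
        {
            const std::size_t start = static_cast<std::size_t>(y) * w + static_cast<std::size_t>(x);
            if (mask[start] == 0 || seen[start]) continue;

            // New region: flood fill from (x, y) and grow its box.
            Box box{x, y, x, y};
            std::queue<std::pair<int, int>> pending;
            pending.push({x, y});
            seen[start] = 1;
            while (!pending.empty())
            {
                const std::pair<int, int> p = pending.front();
                pending.pop();
                box.x0 = std::min(box.x0, p.first);
                box.y0 = std::min(box.y0, p.second);
                box.x1 = std::max(box.x1, p.first);
                box.y1 = std::max(box.y1, p.second);
                for (int k = 0; k < 4; ++k)
                {
                    const int nx = p.first + dx[k];
                    const int ny = p.second + dy[k];
                    if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
                    const std::size_t idx = static_cast<std::size_t>(ny) * w + static_cast<std::size_t>(nx);
                    if (mask[idx] != 0 && !seen[idx])
                    {
                        seen[idx] = 1;
                        pending.push({nx, ny});
                    }
                }
            }
            boxes.push_back(box);
        }
    }
    return boxes;
}
```

Each box could then be turned into a cv::Rect(x0, y0, x1 - x0 + 1, y1 - y0 + 1) and used to crop img2, so that only that small sub-image is passed to the text analysis.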

The complete code for reference. Note that this is the code for the updated version of the answer. You can find the original code in the revision history.

#include <opencv2/opencv.hpp>
#include <vector>
using namespace cv;
using namespace std;

int main()
{
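    // Load the images
    Mat img1 = imread("path_to_img1");
    Mat img2 = imread("path_to_img2");

    // imread returns an empty Mat on failure; XOR also needs equal size and type
    if (img1.empty() || img2.empty() || img1.size() != img2.size() || img1.type() != img2.type())
    {
        return -1;
    }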

    imshow("Img1", img1);
    imshow("Img2", img2);

    // Apply XOR operation, results in a N = img1.channels() image
    Mat maskNch = (img1 ^ img2);

    imshow("XOR", maskNch);

    // Create a binary mask

    // Split each channel
    vector<Mat1b> masks;
    split(maskNch, masks);

    // Create a black mask
    Mat1b mask(maskNch.rows, maskNch.cols, uchar(0));

    // OR with each channel of the N channels mask
    for (size_t i = 0; i < masks.size(); ++i)
    {
        mask |= masks[i];
    }

    // Binarize mask
    mask = mask > 0;

    imshow("Mask", mask);

    // Find connected components
    vector<vector<Point>> contours;
    findContours(mask.clone(), contours, RETR_LIST, CHAIN_APPROX_SIMPLE);

    for (int i = 0; i < static_cast<int>(contours.size()); ++i)
    {
        // Create a black mask
        Mat1b mask_i(mask.rows, mask.cols, uchar(0));
        // Draw the i-th connected component
        // FILLED replaces the legacy CV_FILLED macro (OpenCV 3 and later)
        drawContours(mask_i, contours, i, Scalar(255), FILLED);

        // Create a white image
        Mat diff_i(img2.rows, img2.cols, img2.type());
        diff_i.setTo(255);

        // Copy into diff only different pixels
        img2.copyTo(diff_i, mask_i);

        imshow("Mask " + to_string(i), mask_i);
        imshow("Diff " + to_string(i), diff_i);
    }

    waitKey();
    return 0;
}
Miki answered Oct 05 '22 23:10