
Rectangle document detection using Hough transform (OpenCV, Android)

I am trying to detect a rectangular document using the OpenCV4Android SDK. First I tried to detect it by finding contours, but that does not work with multi-color documents. You can check this link to get a better idea: detecting multi color document with OpenCV4Android

I researched a lot and found that it can be done using the Hough line transform, so I followed this pipeline to detect the document:

original image -> cvtColor (to grayscale) -> GaussianBlur -> dilate (to sharpen edges) -> watershed segmentation -> Canny edge detection with a dynamic Otsu threshold -> Hough line transform

What I did for the Hough line transform is:

    // Probabilistic Hough transform: rho = 1 px, theta = 1 degree,
    // threshold = 50 votes, minLineLength = 100 px, maxLineGap = 50 px
    Imgproc.HoughLinesP(watershedMat, lines, 1, Math.PI / 180, 50, 100, 50);

    List<Line> horizontals = new ArrayList<>();
    List<Line> verticals = new ArrayList<>();
    for (int x = 0; x < lines.rows(); x++)
    {
        double[] vec = lines.get(x, 0);
        double x1 = vec[0],
                y1 = vec[1],
                x2 = vec[2],
                y2 = vec[3];
        Point start = new Point(x1, y1);
        Point end = new Point(x2, y2);
        Line line = new Line(start, end);
        // Classify each segment by its dominant direction
        if (Math.abs(x1 - x2) > Math.abs(y1 - y2)) {
            horizontals.add(line);
        } else if (Math.abs(x2 - x1) < Math.abs(y2 - y1)) {
            verticals.add(line);
        }
    }

And from the above lists of horizontal and vertical lines, I am finding the intersection points as below:

protected Point computeIntersection(Line l1, Line l2) {
    double x1 = l1._p1.x, x2 = l1._p2.x, y1 = l1._p1.y, y2 = l1._p2.y;
    double x3 = l2._p1.x, x4 = l2._p2.x, y3 = l2._p1.y, y4 = l2._p2.y;

    // Denominator of the standard line-line intersection formula;
    // it is zero when the two lines are parallel.
    double d = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4);

    double angle = angleBetween2Lines(l1, l2);
    Log.e("houghline", "angle between 2 lines = " + angle);

    Point pt = new Point();
    pt.x = ((x1 * y2 - y1 * x2) * (x3 - x4) - (x1 - x2) * (x3 * y4 - y3 * x4)) / d;
    pt.y = ((x1 * y2 - y1 * x2) * (y3 - y4) - (y1 - y2) * (x3 * y4 - y3 * x4)) / d;

    return pt;
}

And from those four intersection points I am drawing lines. So far I am able to detect the document through this; see the image below:

[image: document detected via the line intersections]

But when other objects appear near the document, it tries to detect them too. I am scanning the rows top to bottom and the columns left to right to find the intersection points of the largest rectangle. I am getting the following issues:

[images: other objects being detected along with the document]

As you can see in the above images, when another object comes into the frame it gets detected too. How can I detect only the document and ignore other objects? Here is my original image:

[image: original input]

Any help will be highly appreciated! Thanks in advance.

asked Jun 29 '17 by ImLearning


1 Answer

General info

  • I'm using OpenCV 3.2.0 on Windows 10; however, all of the mentioned functionality should be available in 2.4 and/or on Android.
  • I've resized the image for better visualization. This does not affect the current approach to the problem; however, if we were to use some sort of edge detection, we absolutely should use the original image size.
  • The current solution uses a lot of custom functionality (LAB color detection, contour size analysis, etc.) which cannot be published here. If you need help with specific areas, you can of course ask for it in the comments.

General observation of the problem

There are several reasons why your previous approaches did not work. Before we get to the solution, here are some observations that need to be considered:

  • You have an object that contains both darker and brighter elements compared to the background.
  • You have an object that consists of rather distinct parts regarding both brightness and color, as well as general homogeneity. In fact, the object is split by a section that looks a lot like the background.
  • You have background objects that are clearly distinguishable from the general background (e.g. the black object in the upper right corner).
  • The object is often captured from a slightly tilted perspective. This causes a perspective transformation of the otherwise rectangular object.

Solution

Considering the above observations, I don't think simple thresholding or edge detection will yield any reliable results, especially when looking at the variations between the different images of the same scene. As a solution, I'd propose foreground and/or background color detection and classification via the LAB or HSV color space. Sample images of the most prominent colors should be used to classify the respective areas, e.g. for the foreground the dark and bright red as well as the gold/yellowish color of the book. The background consists of a rather homogeneous grayish color which can be used for its detection. Potential algorithm:

  1. Detect and classify fore- and background according to the LAB color space. Use a sensible color distance threshold (for me, something around 8-10% worked in LAB space; AB space might work with 5-7%). If color variation due to varying brightness becomes a problem, switch to a brightness-independent approach (e.g. just use the AB components and ignore the L component).
  2. Exclude parts of background from foreground detection (there may be some overlap in classification so this order will prevent confusion).
  3. On the remaining binary image, apply a contour search and discard contours with too small areas.
  4. The remaining contours form the book. Create a convex hull which you can use as the object ROI (see the sketch below).
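
A minimal sketch of steps 3 and 4, assuming foregroundMask is the binary image left over after the color classification and background exclusion of steps 1 and 2 (the function name extractObjectHull and the minArea parameter are illustrative placeholders, not part of the original code):

#include <opencv2/opencv.hpp>
#include <vector>

// Sketch of steps 3 and 4: discard contours that are too small, then
// build a convex hull from the remaining contour points (object ROI).
std::vector<cv::Point> extractObjectHull(const cv::Mat& foregroundMask, double minArea)
{
    // findContours modifies its input in OpenCV 3.x, so work on a copy
    cv::Mat work = foregroundMask.clone();
    std::vector<std::vector<cv::Point>> contours;
    cv::findContours(work, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);

    // Collect the points of all sufficiently large contours;
    // smaller blobs are treated as classification noise
    std::vector<cv::Point> objectPoints;
    for (const auto& contour : contours)
    {
        if (cv::contourArea(contour) >= minArea)
            objectPoints.insert(objectPoints.end(), contour.begin(), contour.end());
    }

    // The convex hull of the collected points forms the book ROI
    std::vector<cv::Point> hull;
    if (!objectPoints.empty())
        cv::convexHull(objectPoints, hull);
    return hull;
}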

Advantages:

  • Very accurate
  • Works across multiple scenarios (changing background, different illumination - if the right color space is used)

Disadvantages:

  • Difficult to implement for a beginner (knowledge of LAB or HSV, color distances, support for multi-color classification, etc.)
  • Color detection completely dependent on background and foreground. That means if the book changes and is e.g. blue, the sample images have to be adapted.
  • This approach won't work, if all of the top, bottom or the sides of the book look like the background. If that is the case, these parts of the book will be classified as background.

Difficulty of a general solution

There are reasons why the current approach, albeit advanced, will not suffice for general application (varying books, varying backgrounds, etc.).

If you want a generic system that can automatically detect varying books against varying backgrounds, you're in for some trouble. That reaches a level of difficulty that will be hard to solve. It kind of reminds me of the detection of license plates: varying illumination, noise, stained objects, strongly varying backgrounds, bad contrast, etc. And even if you manage that, here's the catch: such a system will only work for specific types of license plates. The same applies to your books.

Tests

Since you posted a very similar question (detecting multi color document with OpenCV4Android), I took the liberty of using the image posted there as well as the ones you provided here. Since one of the images was only available with a red ROI, I used my Photoshop skill level > 9000 to remove the red ROI :).

Sample images for background classification

[image]

Sample images for foreground classification

[images]

Images

[images]

Background classification

[images]

Foreground classification

[images]

Detected objects

[images]



Update

Quick LAB crash course

Since the theory on color spaces is quite vast, you should first read up on some basics and key points. My quick search found this site, which nicely explains some important points: http://www.learnopencv.com/color-spaces-in-opencv-cpp-python/

  • We will use the float variant of OpenCV since it is the simplest one to use (unaltered LAB range, no scaling, no shifting, etc.).
  • LAB value range: the L* axis (lightness) ranges from 0 to 100; the a* and b* axes (color attributes) range from -128 to +127.

Sources and references:

  • What are the ranges of coordinates in the CIELAB color space?
  • http://www.colourphil.co.uk/lab_lch_colour_space.shtml
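
To get a feel for these ranges: in the float representation, pure black maps to roughly (L, a, b) = (0, 0, 0), pure white to about (100, 0, 0), and a fully saturated sRGB red to approximately (53, 80, 67).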

Color distance

https://en.wikipedia.org/wiki/Color_difference

Essentially, we use the Euclidean distance between the two colors. Of course we can omit components from the two colors we compare, e.g. the luminance component (L).

In order to get an intuitive color distance metric, we can simply normalize the color distances to a range between 0.0 and 1.0. This way we can interpret color distances as a deviation in percentage.
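
Written out, this matches the normalization factors used in the code below:

    distLAB(c1, c2) = ||c1 - c2|| / sqrt(100^2 + 255^2 + 255^2)
    distAB(c1, c2)  = ||c1 - c2|| / sqrt(255^2 + 255^2)

Here ||.|| is the Euclidean norm, 100 is the span of the L axis and 255 the span of each of the A and B axes. A distance of 0.07 then means the two colors deviate by 7% of the maximum possible (L)AB distance.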

Example

Let's use the images from the tutorial page posted above in an example. The example application shows the following things:

  • BGR to LAB conversion
  • (L)AB distance calculation
  • (L)AB distance normalization
  • Color classification according to BGR/LAB values and color distance thresholding
  • How colors of objects can change under varying illumination conditions
  • How the distances to other colors may become bigger/closer the darker/lighter the image gets (this also becomes clear if you carefully read the posted link)

Additional tip: the example should show that a single color is often not enough to detect objects of that color under strongly varying illumination conditions. A solution could be to use a different, empirically determined color distance threshold for each color. An alternative is to use many classification sample colors for each color you want to find; you'd have to calculate the color distance to each of these sample colors and combine the found values by ORing the results, as sketched below.
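
A minimal sketch of that multi-sample idea, reusing labExample_calculateLabDistance from the code below (the function classifyWithSamples and its parameters are hypothetical names, not part of the original code):

#include <opencv2/opencv.hpp>
#include <vector>

// Declared in the example below
void labExample_calculateLabDistance(cv::Mat& imgLabFloat, cv::Mat& distances,
    const cv::Vec3f labColor, const bool useOnlyAbDistance);

// Sketch: classify pixels against several sample colors of the same
// object and OR the individual match masks together.
cv::Mat classifyWithSamples(cv::Mat& imgLabFloat,
    const std::vector<cv::Vec3f>& sampleLabColors,
    float maxColorDistance,
    bool useOnlyAbDistance)
{
    cv::Mat combined = cv::Mat::zeros(imgLabFloat.size(), CV_8U);
    for (const auto& sample : sampleLabColors)
    {
        cv::Mat distances;
        labExample_calculateLabDistance(imgLabFloat, distances, sample, useOnlyAbDistance);

        // Pixels close enough to this sample count as a match...
        cv::Mat matches = distances <= maxColorDistance;

        // ...and matches for any of the samples are ORed together
        cv::bitwise_or(combined, matches, combined);
    }
    return combined;
}

The per-color thresholds mentioned above could be passed in as a vector alongside the sample colors instead of the single maxColorDistance used here.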

Code and images

[images]

(images taken from http://www.learnopencv.com/color-spaces-in-opencv-cpp-python/ - a tutorial by Satya Mallick)

#include <opencv2/opencv.hpp>

// Normalization factors for (L)AB distance calculation
// LAB range:
// L: 0.0 - 100.0
// A: -128.0 - 127.0
// B: -128.0 - 127.0
static const float labNormalizationFactor = (float)(1.f / (std::sqrt(std::pow(100, 2) + std::pow(255, 2) + std::pow(255, 2))));
static const float abNormalizationFactor = (float)(1.f / (std::sqrt(std::pow(255, 2) + std::pow(255, 2))));

float labExample_calculateLabDistance(const cv::Vec3f& c1, const cv::Vec3f& c2)
{
    return (float)cv::norm(c1, c2) * labNormalizationFactor;
}

float labExample_calculateAbDistance(const cv::Vec3f& c1, const cv::Vec3f& c2)
{
    cv::Vec2f c1Temp(c1(1), c1(2));
    cv::Vec2f c2Temp(c2(1), c2(2));
    return (float)cv::norm(c1Temp, c2Temp) * abNormalizationFactor;
}

void labExample_calculateLabDistance(
    cv::Mat& imgLabFloat,
    cv::Mat& distances,
    const cv::Vec3f labColor,
    const bool useOnlyAbDistance
)
{
    // Get size for general usage
    const auto& size = imgLabFloat.size();

    // Initialize all distances to the maximum (1.0)
    distances = cv::Mat::zeros(size, CV_32F);
    distances = 1.f;

    for (int y = 0; y < size.height; ++y)
    {       
        for (int x = 0; x < size.width; ++x)
        {   
            // Read LAB value
            const auto& value = imgLabFloat.at<cv::Vec3f>(y,x);

            // Calculate distance
            float distanceValue;
            if (useOnlyAbDistance)
            {
                distanceValue = labExample_calculateAbDistance(value, labColor);
            }
            else
            {
                distanceValue = labExample_calculateLabDistance(value, labColor);
            }

            distances.at<float>(y,x) = distanceValue;
        }
    }
}

// Small hacky function to convert a single 
// BGR color value to LAB float.
// Since the conversion function is not directly available
// we just use a Mat object to do the conversion.
cv::Vec3f labExample_bgrUchar2LabFloat(const cv::Scalar bgr)
{
    // Build Mat with single bgr pixel
    cv::Mat matWithSinglePixel = cv::Mat::zeros(1, 1, CV_8UC3);
    matWithSinglePixel.setTo(bgr);

    // Convert to float and scale accordingly
    matWithSinglePixel.convertTo(matWithSinglePixel, CV_32FC3, 1.0 / 255.0);

    // Convert to LAB and return value
    cv::cvtColor(matWithSinglePixel, matWithSinglePixel, CV_BGR2Lab);
    auto retval = matWithSinglePixel.at<cv::Vec3f>(0, 0);

    return retval;
}

void labExample_convertImageBgrUcharToLabFloat(cv::Mat& src, cv::Mat& dst)
{
    src.convertTo(dst, CV_32FC3, 1.0 / 255.0);
    cv::cvtColor(dst, dst, CV_BGR2Lab);
}

void labExample()
{
    // Load image
    std::string path = "./Testdata/Stackoverflow lab example/";
    std::string filename1 = "1.jpg";
    std::string fqn1 = path + filename1;
    cv::Mat img1 = cv::imread(fqn1, cv::IMREAD_COLOR);
    std::string filename2 = "2.jpg";
    std::string fqn2 = path + filename2;
    cv::Mat img2 = cv::imread(fqn2, cv::IMREAD_COLOR);

    // Combine images by scaling the second image so both images have the same number of columns and then combining them.
    float scalingFactorX = (float)img1.cols / img2.cols;
    float scalingFactorY = scalingFactorX;
    cv::resize(img2, img2, cv::Size(), scalingFactorX, scalingFactorY);

    std::vector<cv::Mat> mats;
    mats.push_back(img1);
    mats.push_back(img2);
    cv::Mat img;
    cv::vconcat(mats, img);

    // Lets use some reference colors.
    // Remember: OpenCV uses BGR as default color space so all colors
    // are BGR by default, too.
    cv::Scalar bgrColorRed(52, 42, 172);
    cv::Scalar bgrColorOrange(3, 111, 219);
    cv::Scalar bgrColorYellow(1, 213, 224);
    cv::Scalar bgrColorBlue(187, 95, 0);
    cv::Scalar bgrColorGray(127, 127, 127);

    // Build LAB image
    cv::Mat imgLabFloat;
    labExample_convertImageBgrUcharToLabFloat(img, imgLabFloat);

    // Convert bgr ref color to lab float.
    // INSERT color you want to analyze here:
    auto colorLabFloat = labExample_bgrUchar2LabFloat(bgrColorRed);

    cv::Mat colorDistancesWithL;
    cv::Mat colorDistancesWithoutL;
    labExample_calculateLabDistance(imgLabFloat, colorDistancesWithL, colorLabFloat, false);
    labExample_calculateLabDistance(imgLabFloat, colorDistancesWithoutL, colorLabFloat, true);

    // Color distances. They can differ for every color being analyzed.
    float maxColorDistanceWithL = 0.07f;
    float maxColorDistanceWithoutL = 0.07f;

    cv::Mat detectedValuesWithL = colorDistancesWithL <= maxColorDistanceWithL;
    cv::Mat detectedValuesWithoutL = colorDistancesWithoutL <= maxColorDistanceWithoutL;

    cv::Mat imgWithDetectedValuesWithL = cv::Mat::zeros(img.size(), CV_8UC3);
    cv::Mat imgWithDetectedValuesWithoutL = cv::Mat::zeros(img.size(), CV_8UC3);

    img.copyTo(imgWithDetectedValuesWithL, detectedValuesWithL);
    img.copyTo(imgWithDetectedValuesWithoutL, detectedValuesWithoutL);

    cv::imshow("img", img);
    cv::imshow("colorDistancesWithL", colorDistancesWithL);
    cv::imshow("colorDistancesWithoutL", colorDistancesWithoutL);
    cv::imshow("detectedValuesWithL", detectedValuesWithL);
    cv::imshow("detectedValuesWithoutL", detectedValuesWithoutL);
    cv::imshow("imgWithDetectedValuesWithL", imgWithDetectedValuesWithL);
    cv::imshow("imgWithDetectedValuesWithoutL", imgWithDetectedValuesWithoutL);
    cv::waitKey();
}

int main(int argc, char** argv)
{
    labExample();
    return 0;
}
answered Sep 29 '22 by Baiz