How to crop biggest rectangle out of an image

I have a few images of pages on a table and would like to crop the pages out of them. The page is generally the biggest rectangle in the image, but in some cases not all four sides of the rectangle are visible.

I am doing the following but not getting the desired results:

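import cv2
import numpy as np

im = cv2.imread('images/img5.jpg')
gray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
# Fixed global threshold at 127 (cv2.THRESH_BINARY has the value 0)
ret, thresh = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)
# findContours returns 3 values in OpenCV 3.x and 2 values in 2.x/4.x;
# the contour list is the second-to-last value in every version
contours = cv2.findContours(thresh, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)[-2]
# Take the contour with the largest area and draw its bounding box
areas = [cv2.contourArea(c) for c in contours]
max_index = np.argmax(areas)
cnt = contours[max_index]
x, y, w, h = cv2.boundingRect(cnt)
cv2.rectangle(im, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imshow("Show", im)
cv2.imwrite("images/img5_rect.jpg", im)
cv2.waitKey(0)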

Below are a few examples:

1st Example: I can find the rectangle in this image; however, I would like the remaining part of the wood to be cropped out as well.

2nd Example: The correct dimensions of the rectangle are not found in this image.

3rd Example: The correct dimensions are not found in this image either.

4th Example: The same happens with this image.

Anthony asked May 02 '16 12:05


2 Answers

I have done something similar before. I experimented with Hough transforms, but in my case they were much harder to get right than contours. The following suggestions should help you get started:

  1. Paper (at least its edges) is generally white, so you may have better luck in a colorspace like YUV, which separates luminance from color:

    image_yuv = cv2.cvtColor(image, cv2.COLOR_BGR2YUV)
    # Keep only the Y (luminance) channel as a single-channel uint8 image
    image_y = image_yuv[:, :, 0].copy()
    
  2. The text on the paper is a problem. Blur the image to (hopefully) remove this high-frequency detail. Morphological operations such as dilation may also help.

    image_blurred = cv2.GaussianBlur(image_y, (3, 3), 0)
    
  3. You may try the Canny edge detector rather than a simple threshold. It is not strictly necessary, but it may help:

    edges = cv2.Canny(image_blurred, 100, 300, apertureSize=3)
    
  4. Then find the contours. In my case I only used the extreme outer contours (RETR_EXTERNAL). You may use the CHAIN_APPROX_SIMPLE flag to compress the contours:

    # [-2:] keeps (contours, hierarchy) whether findContours returns 2 or 3 values
    contours, hierarchy = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2:]
    
  5. Now you should have a bunch of contours. Time to find the right ones. For each contour cnt, first find the convex hull, then use approxPolyDP to simplify the contour as much as possible.

    hull = cv2.convexHull(cnt)
    # Tolerance is 0.1% of the hull perimeter; True marks the curve as closed
    simplified_cnt = cv2.approxPolyDP(hull, 0.001 * cv2.arcLength(hull, True), True)
    
  6. Now use this simplified contour to find the enclosing quadrilateral. You can experiment with whatever rules you come up with. The simplest method is to pick the four longest segments of the contour and build the enclosing quadrilateral by intersecting the four lines they lie on. Depending on your case, you can also select these lines by the contrast across them, the angles between them, and similar criteria.

  7. Now you have a bunch of quadrilaterals. Pick the one you need in two steps. First, remove those that are probably wrong, for example any quadrilateral with an angle of more than 175 degrees. Then pick the one with the biggest area as the final result. The orange contour in the figure is one of the results I got at this point. [Figure: All Contours]

  8. The final step, after finding (hopefully) the right quadrilateral, is transforming it back to a rectangle. For this you can use findHomography to compute a transformation matrix.

    # Target corners in the order top-left, top-right, bottom-right, bottom-left
    page = np.array([[[0., 0.]], [[2150., 0.]], [[2150., 2800.]], [[0., 2800.]]], dtype=np.single)
    (H, mask) = cv2.findHomography(cnt.astype('single'), page)
    

    The numbers assume a projection onto letter paper (2150 x 2800 is close to the 8.5 x 11 aspect ratio). You may choose a better target size for your case. You also need to reorder the contour points so that they match the order of the target corners. Then call warpPerspective to create the final image:

    final_image = cv2.warpPerspective(image, H, (2150, 2800))
    

    This warping should produce something like my earlier result. [Figure: Warping]
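The "four longest segments" rule from step 6 can be sketched in plain Python as follows. This is only a sketch under assumptions: the helper names (longest_segments, intersect, enclosing_quad) are hypothetical, and the input is taken to be a list of (x, y) tuples, such as simplified_cnt.reshape(-1, 2).tolist() would give for the contour from step 5.

```python
import math


def longest_segments(points, count=4):
    """Return the `count` longest edges of a closed polygon, in contour order."""
    n = len(points)
    edges = [(points[i], points[(i + 1) % n]) for i in range(n)]

    def edge_length(i):
        (x1, y1), (x2, y2) = edges[i]
        return math.hypot(x2 - x1, y2 - y1)

    # Pick the longest edges, then restore contour order so that
    # consecutive entries are neighbouring sides of the quadrilateral.
    longest = sorted(range(n), key=edge_length, reverse=True)[:count]
    return [edges[i] for i in sorted(longest)]


def intersect(seg_a, seg_b):
    """Intersect the infinite lines through two segments; None if parallel."""
    (x1, y1), (x2, y2) = seg_a
    (x3, y3), (x4, y4) = seg_b
    denom = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    if abs(denom) < 1e-9:
        return None
    det_a = x1 * y2 - y1 * x2
    det_b = x3 * y4 - y3 * x4
    px = (det_a * (x3 - x4) - (x1 - x2) * det_b) / denom
    py = (det_a * (y3 - y4) - (y1 - y2) * det_b) / denom
    return (px, py)


def enclosing_quad(points):
    """Corners of the quadrilateral spanned by the four longest edges, or None."""
    sides = longest_segments(points, 4)
    if len(sides) < 4:
        return None
    corners = [intersect(sides[i], sides[(i + 1) % 4]) for i in range(4)]
    return None if any(c is None for c in corners) else corners


# A page whose top-right corner is clipped: the short diagonal edge is dropped
# and the two neighbouring sides are extended until they meet.
quad = enclosing_quad([(0, 0), (10, 0), (10, 7), (9, 8), (0, 8)])
```

Keeping the selected edges in contour order is what makes "intersect each side with the next one" yield the four corners. If two neighbouring sides are parallel the sketch simply gives up, which is one of the cases where a different selection rule would be needed.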

I hope this helps you to find an appropriate approach in your case.
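For steps 7 and 8, a minimal sketch of the filtering and corner reordering might look like this. The function names and the max_angle parameter are hypothetical, quadrilaterals are assumed to be convex lists of four (x, y) tuples, and the corner ordering uses a common sum/difference heuristic that assumes the page is not rotated by close to 45 degrees.

```python
import math


def interior_angles(quad):
    """Angle in degrees at each corner of a convex quadrilateral."""
    angles = []
    for i in range(4):
        px, py = quad[i - 1]
        cx, cy = quad[i]
        nx, ny = quad[(i + 1) % 4]
        a = math.atan2(py - cy, px - cx)
        b = math.atan2(ny - cy, nx - cx)
        deg = abs(math.degrees(a - b)) % 360
        angles.append(360 - deg if deg > 180 else deg)
    return angles


def area(quad):
    """Polygon area by the shoelace formula."""
    total = 0.0
    for i in range(4):
        x1, y1 = quad[i]
        x2, y2 = quad[(i + 1) % 4]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


def pick_page(quads, max_angle=175.0):
    """Drop quads with a nearly flat corner, then return the largest one."""
    plausible = [q for q in quads if max(interior_angles(q)) <= max_angle]
    return max(plausible, key=area, default=None)


def order_corners(quad):
    """Order corners as top-left, top-right, bottom-right, bottom-left.

    Uses image coordinates (y grows downwards): the top-left corner has the
    smallest x + y, the bottom-right the largest; the top-right corner has
    the largest x - y, the bottom-left the smallest.
    """
    tl = min(quad, key=lambda p: p[0] + p[1])
    br = max(quad, key=lambda p: p[0] + p[1])
    tr = max(quad, key=lambda p: p[0] - p[1])
    bl = min(quad, key=lambda p: p[0] - p[1])
    return [tl, tr, br, bl]


rectangle = [(0, 0), (10, 0), (10, 8), (0, 8)]
sliver = [(0, 0), (50, -1), (100, 0), (50, 40)]  # larger, but one corner is almost flat
page = pick_page([rectangle, sliver])
ordered = order_corners([(12, 95), (90, 100), (8, 5), (100, 10)])
```

The ordering produced by order_corners matches the target corner order used with findHomography above, so the ordered points could be converted to a float array and passed in place of cnt.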

Kamyar Infinity answered Nov 17 '22 09:11


That's a pretty complicated task that cannot be solved by simply searching for contours. The Economist cover, for example, shows only one edge of the magazine, which splits the image in half. How should your computer know which half is the magazine and which is the table? So you have to add much more intelligence to your program.

You might look for lines in your image, for example with a Hough transform. Then find sets of more or less parallel or orthogonal lines, lines of a certain length, and so on. Find prints by checking for typical print colours, or for colours that you usually don't find on a table. Search for the high-frequency contrast that printed text creates. Imagine how you as a human recognize a printed paper.
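As a rough illustration of the "parallel or orthogonal lines of a certain length" idea, the sketch below groups line segments by orientation in plain Python. The function names, the min_length and tolerance defaults, and the (x1, y1, x2, y2) segment layout are assumptions made for the example; that layout is what rows of a cv2.HoughLinesP result look like after reshaping to four columns.

```python
import math


def orientation(seg):
    """Undirected orientation of a segment in degrees, in [0, 180]."""
    x1, y1, x2, y2 = seg
    return math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0


def angle_between(seg_a, seg_b):
    """Smallest angle between the lines through two segments, in [0, 90]."""
    diff = abs(orientation(seg_a) - orientation(seg_b))
    return min(diff, 180.0 - diff)


def length(seg):
    x1, y1, x2, y2 = seg
    return math.hypot(x2 - x1, y2 - y1)


def candidate_sides(segments, min_length=50.0, tolerance=10.0):
    """Return (parallel_pairs, orthogonal_pairs) among the long segments."""
    long_segs = [s for s in segments if length(s) >= min_length]
    parallel, orthogonal = [], []
    for i, a in enumerate(long_segs):
        for b in long_segs[i + 1:]:
            angle = angle_between(a, b)
            if angle <= tolerance:
                parallel.append((a, b))
            elif angle >= 90.0 - tolerance:
                orthogonal.append((a, b))
    return parallel, orthogonal


segments = [
    (0, 0, 100, 0),    # bottom edge
    (0, 50, 100, 52),  # nearly parallel to the bottom edge
    (0, 0, 0, 80),     # roughly perpendicular side
    (0, 0, 5, 5),      # too short, ignored
]
parallel, orthogonal = candidate_sides(segments)
```

Pairs found this way are only candidates for page edges; deciding which side of an edge is paper still needs the colour or texture cues mentioned above.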

All in all, this is too broad a question for StackOverflow. Try to break it down into smaller sub-problems and solve them, and if you hit a wall, come back here.

Piglet answered Nov 17 '22 08:11