Remove non straight lines from text image

Tags: python, image, image-processing, opencv, line

I have an image containing text, but with non-straight lines drawn on it.

[image: https://i.stack.imgur.com/6UEr7.jpg]

I want to remove those lines without affecting/removing anything from the text.
For that I used Hough probabilistic transform:

import cv2
import numpy as np


def remove_lines(filename):
    img = cv2.imread(filename)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 200)
    lines = cv2.HoughLinesP(edges, rho=1, theta=1*np.pi/180,
                            threshold=100, minLineLength=100, maxLineGap=5)
    # Draw the detected segments on the image
    # (HoughLinesP returns None when no segment is found)
    if lines is not None:
        for line in lines:
            x1, y1, x2, y2 = map(int, line[0])
            cv2.line(img, (x1, y1), (x2, y2), (0, 0, 255), 3)

    # imwrite picks the encoder from the file extension, so the name needs one
    cv2.imwrite('result.jpg', img)

The result was not as good as I expected:

[image: https://i.stack.imgur.com/5Y5FL.jpg]

The lines were not entirely detected (only some segments, the straight segments, of the lines were detected).
I adjusted the cv2.Canny and cv2.HoughLinesP parameters, but that didn't help either.

I also tried cv2.createLineSegmentDetector (not available in the latest version of OpenCV due to a license issue, so I had to downgrade OpenCV to version 4.0.0.21):

import cv2
import numpy as np


def remove_lines(filename):
    im = cv2.imread(filename)
    gray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
    # Create an LSD detector (refine mode 0, i.e. cv2.LSD_REFINE_NONE)
    lsd = cv2.createLineSegmentDetector(0)

    # Detect lines in the image (position 0 of the returned tuple holds the
    # detected lines)
    lines = lsd.detect(gray)[0]

    # drawn_img = lsd.drawSegments(res, lines)
    # Keep only segments spanning more than 70 px horizontally or vertically
    for element in lines:
        x1, y1, x2, y2 = (int(v) for v in element[0])
        if abs(x1 - x2) > 70 or abs(y1 - y2) > 70:
            cv2.line(im, (x1, y1), (x2, y2), (0, 0, 255), 3)
    cv2.imwrite('lsd.jpg', im)

The result was a bit better, but it still didn't detect the entire lines.

[image: https://i.stack.imgur.com/ISih9.jpg]

Any idea how to make the line detection more effective?


asked Oct 24 '19 11:10

singrium

1 Answer

Typical methods for removing lines use horizontal/vertical kernels or cv2.HoughLinesP(), but these only work if the lines are straight. In this case the lines are not straight, so one idea is to use a diagonal kernel, morphological transformations, and contour filtering to remove the lines from the text. I will use the approach from a previous answer, "Removing horizontal lines in an image", but with a diagonal kernel.

We begin by converting the image to grayscale and applying Otsu's threshold to obtain a binary image. Next we create a diagonal kernel and perform a morphological opening (cv2.MORPH_OPEN) to isolate the diagonal lines. Since cv2.getStructuringElement() does not offer a built-in diagonal kernel, we create our own:

# Read in image, grayscale, and Otsu's threshold
image = cv2.imread('1.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# Create diagonal kernel
kernel = np.array([[0, 0, 1],
                   [0, 1, 0],
                   [1, 0, 0]], dtype=np.uint8)
opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel, iterations=1)

The opening isolates the main diagonal lines, but it also picks up small diagonal strokes from the text. To remove them, we find contours and filter by contour area: any contour with an area below the threshold (500 here) is treated as noise and erased by filling it with black using cv2.drawContours(). This leaves only the diagonal lines we want to remove:

# Find contours and filter using contour area to remove noise
cnts = cv2.findContours(opening, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    area = cv2.contourArea(c)
    if area < 500:
        cv2.drawContours(opening, [c], -1, (0,0,0), -1)

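From here we simply cv2.bitwise_xor() the mask with the original image to get our result. Because the mask is white (255) only where a line was detected, the XOR inverts those pixels, turning the dark lines light, and leaves every other pixel unchanged: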

# Bitwise-xor with original image
opening = cv2.merge([opening, opening, opening])
result = cv2.bitwise_xor(image, opening)
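
The effect of the XOR can be checked on a few pixel values. The sketch below uses np.bitwise_xor as a stand-in for cv2.bitwise_xor (both compute a per-element XOR on uint8 arrays), and the pixel values are made up for illustration.

```python
import numpy as np

# Hypothetical 1x4 grayscale strip: two dark "line" pixels, two light background pixels.
image = np.array([[10, 40, 250, 255]], dtype=np.uint8)
# Mask as produced by the opening step: 255 on detected line pixels, 0 elsewhere.
mask = np.array([[255, 255, 0, 0]], dtype=np.uint8)

# XOR with 255 maps a value v to 255 - v; XOR with 0 leaves it unchanged.
result = np.bitwise_xor(image, mask)
print(result.tolist())  # [[245, 215, 250, 255]]
```

This also shows the main limitation: any text pixel that falls under the mask gets inverted too, so wherever a line crosses a character, part of the character is lost.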

Notes: It is difficult to remove the lines without affecting the text, although it is possible with some clever tricks to "repair" the text. Take a look at "Remove borders from image but keep text written on borders" for a method to reconstruct the missing text.

Another way to isolate the diagonal lines would be to take a contrarian approach: instead of trying to detect diagonal lines, try to determine what is not a diagonal line. You could probably do this with simple filtering techniques.

To create diagonal kernels dynamically, you could use np.diag() to build kernels of different sizes for different diagonal line widths.
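
One possible way to build such kernels is sketched below. It uses np.eye (equivalent to applying np.diag() to a vector of ones) plus np.fliplr for the opposite direction. The helper name diagonal_kernel and its anti parameter are hypothetical, not part of NumPy or OpenCV.

```python
import numpy as np

def diagonal_kernel(size, anti=True):
    """Build a size x size uint8 kernel with ones along one diagonal.

    anti=True runs bottom-left to top-right (the direction used in the answer);
    anti=False runs top-left to bottom-right.
    """
    kernel = np.eye(size, dtype=np.uint8)
    # Copy after flipping so the result is a contiguous array rather than a view.
    return np.fliplr(kernel).copy() if anti else kernel

print(diagonal_kernel(3))
print(diagonal_kernel(5, anti=False))
```

A larger kernel only lets longer diagonal runs survive the opening, so the size is a trade-off between rejecting text strokes and keeping the curved parts of the lines. The right value likely depends on the image resolution and has to be found by experiment.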

Full code for completeness

import cv2
import numpy as np

# Read in image, grayscale, and Otsu's threshold
image = cv2.imread('1.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# Create diagonal kernel
kernel = np.array([[0, 0, 1],
                   [0, 1, 0],
                   [1, 0, 0]], dtype=np.uint8)
opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel, iterations=1)

# Find contours and filter using contour area to remove noise
cnts = cv2.findContours(opening, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    area = cv2.contourArea(c)
    if area < 500:
        cv2.drawContours(opening, [c], -1, (0,0,0), -1)

# Bitwise-xor with original image
opening = cv2.merge([opening, opening, opening])
result = cv2.bitwise_xor(image, opening)

cv2.imshow('thresh', thresh)
cv2.imshow('opening', opening)
cv2.imshow('result', result)
cv2.waitKey()


answered Oct 09 '22 02:10

nathancy
