Remove noisy lines from an image

I have images that are corrupted by random lines, like the following one:
[image: example input]
I want to preprocess them to remove the unwanted noise (the lines that distort the writing) so that I can run OCR (Tesseract) on them.
My idea was to use dilation to remove the noise, then erosion to restore the missing parts of the writing in a second step.
For that, I used this code:

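import cv2
import numpy as np

# Load as a single-channel grayscale image
img = cv2.imread('linee.png', cv2.IMREAD_GRAYSCALE)
kernel = np.ones((5, 5), np.uint8)
# Dilation is a local maximum, so it thins dark strokes on a light background
img = cv2.dilate(img, kernel, iterations=1)
# Erosion is a local minimum, so it thickens the dark strokes that survived
img = cv2.erode(img, kernel, iterations=1)
cv2.imwrite('delatedtest.png', img)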

Unfortunately, the dilation didn't work well: the noise lines are still there.

[image: result after dilation and erosion]
I tried changing the kernel shape, but it got worse: the writing was partially or completely deleted.
I also found an answer saying that it is possible to remove the lines by

turning all black pixels with two or less adjacent black pixels to white.
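Read literally, that rule could be sketched with NumPy roughly as follows. This is only an illustration: the function name `remove_thin_pixels`, the boolean "True means black" convention, and the use of the 8-neighbourhood are assumptions, not part of the quoted answer.

```python
import numpy as np

def remove_thin_pixels(binary, max_neighbors=2):
    """Turn black pixels with at most `max_neighbors` black 8-neighbours white.

    `binary` is a 2-D bool array in which True marks a black (ink) pixel.
    The result uses the same convention.
    """
    h, w = binary.shape
    # Zero-pad so that border pixels simply see white outside the image.
    padded = np.pad(binary.astype(np.uint8), 1)
    neighbors = np.zeros((h, w), dtype=np.uint8)
    # Sum the eight shifted copies of the image to count black neighbours.
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            neighbors += padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
    # A pixel stays black only if it has more than `max_neighbors` black neighbours.
    return binary & (neighbors > max_neighbors)
```

A grayscale image could be turned into such a mask with something like `img < 128` (the threshold is a guess). Note that this single pass also erases any 1-pixel-wide strokes of the writing, and leaves line pixels that touch the writing in place.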

That seems a bit complicated for me, since I am a beginner in computer vision and OpenCV.
Any help would be appreciated, thank you.

asked Jan 03 '19 19:01

test

1 Answer

Detecting lines like these is what the path opening was invented for. DIPlib has an implementation (disclosure: I implemented it there). As an alternative, you can try using the implementation by the authors of the paper that I linked above. That implementation does not have the "constrained" mode that I use below.

Here is a quick demo of how you can use it:

import diplib as dip
import matplotlib.pyplot as pp

img = 1 - pp.imread('/home/cris/tmp/DWRTF.png')
lines = dip.PathOpening(img, length=300, mode={'constrained'})

Here we first invert the image, because that makes the later steps easier. Without inverting, use a path closing instead. The lines image:

[image: lines]
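To give some intuition for what the filter does: a path opening keeps only structures that contain a path of at least `length` pixels, where the path is allowed to bend slightly. Its simplest relative is an opening with a straight line segment. The sketch below implements only that special case for horizontal runs in a binary image. It is a simplified stand-in, not DIPlib's algorithm, and the name `horizontal_opening` is made up for illustration.

```python
import numpy as np

def horizontal_opening(binary, length):
    """Keep only pixels that lie on a horizontal run of at least `length` True pixels.

    This equals a binary opening with a 1 x `length` line structuring element.
    """
    h, w = binary.shape
    out = np.zeros_like(binary)
    for y in range(h):
        run_start = None
        # Iterate one step past the row end so a run touching the border is closed.
        for x in range(w + 1):
            inside = x < w and binary[y, x]
            if inside and run_start is None:
                run_start = x
            elif not inside and run_start is not None:
                if x - run_start >= length:
                    out[y, run_start:x] = True
                run_start = None
    return out
```

Short strokes of the writing fail the length test and disappear, while long lines survive. Lines that are slanted or curved would not be caught by this straight-segment version, which is where the more flexible path opening comes in.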

Next we subtract the lines. A small area opening removes the few isolated pixels of the line that were filtered out by the path opening:

text = img - lines
text = dip.AreaOpening(text, filterSize=5)

[image: text]
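For a binary image, an area opening amounts to deleting every connected component with fewer pixels than a given size. The following is a minimal sketch of that binary case in plain NumPy. The code above applies DIPlib's filter to a grey-value image, which this does not reproduce, and both the function name `binary_area_opening` and the choice of 8-connectivity are assumptions.

```python
from collections import deque

import numpy as np

def binary_area_opening(binary, min_size):
    """Remove 8-connected components that have fewer than `min_size` pixels."""
    h, w = binary.shape
    seen = np.zeros((h, w), dtype=bool)
    out = np.zeros((h, w), dtype=bool)
    for sy in range(h):
        for sx in range(w):
            if not binary[sy, sx] or seen[sy, sx]:
                continue
            # Breadth-first flood fill collects one connected component.
            component = [(sy, sx)]
            seen[sy, sx] = True
            queue = deque(component)
            while queue:
                y, x = queue.popleft()
                for ny in range(max(y - 1, 0), min(y + 2, h)):
                    for nx in range(max(x - 1, 0), min(x + 2, w)):
                        if binary[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            component.append((ny, nx))
                            queue.append((ny, nx))
            # Copy the component to the output only if it is large enough.
            if len(component) >= min_size:
                ys, xs = zip(*component)
                out[list(ys), list(xs)] = True
    return out
```

With `min_size=5`, isolated specks of up to four pixels are dropped, while anything larger is left untouched.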

However, we've now made gaps in the text. Filling these up is not trivial. Here is a quick-and-dirty attempt, which you can use as a starting point:

lines = lines > 0.5
text = text > 0.5
lines -= dip.BinaryPropagation(text, lines, connectivity=-1, iterations=3)
img[lines] = 0

[image: final result]
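The idea behind this last step is that line pixels lying within a few steps of the text probably belong to a stroke the line crossed, so they are kept, and only the rest of the line is erased. A rough NumPy sketch of that idea follows, assuming boolean `text` and `lines` arrays. The helper names `dilate8` and `propagate` are hypothetical, and a fixed 8-connected neighbourhood is used here, which is a simplification and may not match what `connectivity=-1` does in DIPlib.

```python
import numpy as np

def dilate8(binary):
    """One step of binary dilation with a 3x3 square (8-connected)."""
    h, w = binary.shape
    padded = np.pad(binary, 1)  # pads with False
    out = np.zeros_like(binary)
    for dy in range(3):
        for dx in range(3):
            out |= padded[dy:dy + h, dx:dx + w]
    return out

def propagate(seed, mask, iterations):
    """Grow `seed` into `mask` for `iterations` steps; return the mask pixels reached."""
    grown = seed.copy()
    for _ in range(iterations):
        # Growth is only allowed inside the mask; the seed itself is always kept.
        grown = seed | (dilate8(grown) & mask)
    return grown & mask

# Toy example: a horizontal line cutting through a vertical text stroke.
lines = np.zeros((3, 15), dtype=bool)
lines[1, :] = True
text = np.zeros((3, 15), dtype=bool)
text[0, 7] = True
text[2, 7] = True

keep = propagate(text, lines, 3)  # line pixels close to the text
erase = lines & ~keep             # line pixels that are safe to remove
```

In this toy case the seven line pixels around the stroke are kept and the other eight are erased, which also shows why the approach is only approximate: a short stub of the line remains on either side of the stroke.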

answered Sep 19 '22 17:09

Cris Luengo

Tags: python, image-processing, opencv, noise-reduction
