I would like to capture the number from this kind of picture.
I tried multi-scale matching from the following link.
http://www.pyimagesearch.com/2015/01/26/multi-scale-template-matching-using-python-opencv/
All I want to know is the red number. The problem is that the red number is too blurry for OpenCV's template matching to recognize. Is there another way to detect this red number on the black background?
If you have any background in signal processing, the first method to consider is computing the Fast Fourier Transform of the image and examining the distribution of low and high frequencies: if there is little high-frequency content, the image can be considered blurry.
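As a rough illustration of the FFT idea, here is a minimal sketch using NumPy only. The function name high_frequency_fraction and the cutoff of 0.1 are assumptions made for this example, not values from any library; a real blur threshold would have to be tuned on your own images.

```python
import numpy as np

def high_frequency_fraction(gray, cutoff=0.1):
    """Fraction of spectral energy outside a low-frequency disc (hypothetical helper)."""
    gray = np.asarray(gray, dtype=float)
    gray = gray - gray.mean()  # drop the DC term so it does not dominate
    power = np.abs(np.fft.fftshift(np.fft.fft2(gray))) ** 2
    total = power.sum()
    if total == 0:
        return 0.0  # a perfectly flat image has no frequency content at all
    rows, cols = gray.shape
    y, x = np.ogrid[:rows, :cols]
    # Normalised distance from the centre of the shifted spectrum
    radius = np.hypot((y - rows // 2) / rows, (x - cols // 2) / cols)
    return float(power[radius > cutoff].sum() / total)

# Synthetic stand-ins: random noise as the "sharp" image, a 5x5 box blur of it as the "blurry" one
rng = np.random.default_rng(0)
sharp = rng.random((64, 64))
blurred = sum(np.roll(np.roll(sharp, dy, axis=0), dx, axis=1)
              for dy in range(-2, 3) for dx in range(-2, 3)) / 25.0

sharp_score = high_frequency_fraction(sharp)
blurred_score = high_frequency_fraction(blurred)
print(sharp_score, blurred_score)
```

The blurred image should score noticeably lower than the sharp one; where to draw the line between "sharp" and "blurry" depends on the image content.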
Image blur is calculated with the basic formula: b = dwS cos θ. This equation uses exposure duration directly.
Average blurring (cv2.blur in OpenCV) replaces each pixel with the mean of its neighbourhood. This allows us to reduce noise and the level of detail, simply by relying on the average.
You can also sharpen an image with a 2D convolution kernel. First define a custom 2D kernel, and then use OpenCV's filter2D() function to apply the convolution operation to the image. A 3×3 kernel is enough to define a sharpening kernel. Check out this resource to learn more about commonly used kernels.
Classifying Digits
You clarified in comments that you've already isolated the number part of the image pre-detection, so I'll start under that assumption.
Perhaps you can approximate the perspective effects and "blurriness" of the number by treating it as a handwritten number. In that case, there is a famous dataset of handwritten numerals for classification training called MNIST.
Yann LeCun has enumerated the state of the art on this dataset on the MNIST handwritten digit dataset page.
At the far end of the spectrum, convolutional neural networks yield outrageously low error rates (fractions of 1% error). For a simpler solution, k-nearest neighbours using deskewing, noise removal, blurring, and a 2-pixel shift yielded about 1% error, and is significantly faster to implement. OpenCV's Python bindings include an implementation. Neural networks and support vector machines with deskewing also have some pretty impressive performance rates.
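To give a feel for the simpler end of that spectrum, here is a minimal k-nearest-neighbours sketch. It is an assumption-laden stand-in: it uses sklearn's small built-in 8x8 digits dataset rather than MNIST, and it skips the deskewing, noise removal and pixel-shift steps mentioned above. The split ratio and n_neighbors=3 are arbitrary choices.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# 8x8 digit images, already flattened to 64 features per sample
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)

knn_accuracy = knn.score(X_test, y_test)
print(f"k-NN accuracy on held-out digits: {knn_accuracy:.3f}")
```

Accuracy on this clean dataset says little about blurry photographed digits; it only shows how little code a baseline takes.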
Note that convolutional networks don't have you pick your own features, so the important color-differential information here might just be used for narrowing the region-of-interest. Other approaches, where you define your feature space, might incorporate the known color difference more precisely.
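One way the colour difference could narrow the region of interest is a simple threshold on the red channel. This is only a sketch: the function name red_bounding_box and both thresholds are made up for illustration, and it assumes the array is in RGB channel order (images loaded with cv2.imread are BGR and would need the channels swapped).

```python
import numpy as np

def red_bounding_box(rgb, min_red=120, max_other=80):
    """Return (top, bottom, left, right) around strongly red pixels, or None (hypothetical helper)."""
    rgb = np.asarray(rgb).astype(int)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    # "Red" here means a bright red channel with dim green and blue channels
    mask = (r >= min_red) & (g <= max_other) & (b <= max_other)
    if not mask.any():
        return None
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    # bottom and right are exclusive, so the result can be used directly in a slice
    return int(rows[0]), int(rows[-1]) + 1, int(cols[0]), int(cols[-1]) + 1

# Synthetic stand-in: black background with a red patch
frame = np.zeros((20, 30, 3), dtype=np.uint8)
frame[5:12, 8:15, 0] = 200
box = red_bounding_box(frame)
print(box)
```

The crop frame[top:bottom, left:right] can then be handed to whichever classifier you choose.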
Python supports a lot of machine learning techniques in the terrific package sklearn; here are examples of sklearn applied to MNIST. If you're looking for a tutorial-style explanation of machine learning in Python, sklearn's own tutorial is very detailed.
From the sklearn link:
Those are the kinds of items you're trying to classify if you learn using this approach. To emphasize how easy it is to start training some of these machine learning-based classifiers, here is an abridged section from the example code in the linked sklearn package:
from sklearn import datasets, svm
digits = datasets.load_digits() # built-in to sklearn!
n_samples = len(digits.images)
# Flatten each 8x8 image into a 64-element feature vector
data = digits.images.reshape((n_samples, -1))
# Create a classifier: a support vector classifier
classifier = svm.SVC(gamma=0.001)
# We learn the digits on the first half of the digits
classifier.fit(data[:n_samples // 2], digits.target[:n_samples // 2])
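The abridged snippet stops after training. A self-contained sketch of the remaining step, predicting on the second half of the dataset and measuring accuracy, could look like the following. It repeats the setup so it runs on its own, and the variable names half, predicted and accuracy are just choices for this example.

```python
from sklearn import datasets, metrics, svm

digits = datasets.load_digits()
n_samples = len(digits.images)
data = digits.images.reshape((n_samples, -1))
half = n_samples // 2

# Train on the first half, exactly as in the abridged example
classifier = svm.SVC(gamma=0.001)
classifier.fit(data[:half], digits.target[:half])

# Evaluate on the half the classifier has never seen
predicted = classifier.predict(data[half:])
accuracy = metrics.accuracy_score(digits.target[half:], predicted)
print(f"accuracy on the held-out half: {accuracy:.3f}")
```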
If you're wedded to OpenCV (possibly because you want to port to a real-time system in the future), OpenCV 3 with Python has a tutorial on this exact topic too! Their demo uses k-nearest neighbours (listed on the LeCun page), but OpenCV also has SVMs and many of the other tools found in sklearn. Their OCR page using SVMs applies deskewing, which might be useful with the perspective effect in your problem.
UPDATE: I used the out-of-the-box sklearn approach outlined above on your image, heavily cropped, and it correctly classified it. A lot more testing would be required to see if this is robust in practice.
That tiny image is the 8x8 crop of the image you embedded in your question. sklearn's built-in digits dataset consists of 8x8 images (MNIST proper is 28x28), which is why it trains in less than a second with default arguments in sklearn.
I converted it to the correct format by scaling it up to the range of the digits dataset using:
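import imageio.v2 as imageio # scipy.misc.imread was removed in SciPy 1.2; imageio is one replacement
number = imageio.imread("cropped_image.png")
# First channel only, scaled by a hand-tuned factor of 15, as a single row of 64 features
datum = (number[:,:,0]*15).astype(int).reshape((1, 64))
classifier.predict(datum) # returned 8 for my crop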
I didn't change anything else from the example; here, I'm only using the first channel for classification, and no smart feature computation. A scale factor of 15 looked about right to me; you'll need to tune it to get within the target range or (ideally) provide your own training and testing set.
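If hand-tuning the scale factor is awkward, one alternative is to normalise the crop automatically. The sketch below is an assumption, not what I actually ran: the helper to_digits_features is a made-up name, it block-averages a larger single-channel crop down to 8x8, and it rescales so the brightest cell maps to 16, the top of the intensity range in sklearn's digits dataset.

```python
import numpy as np

def to_digits_features(channel, size=8, max_value=16.0):
    """Shrink a 2D crop to size x size and rescale to 0..max_value (hypothetical helper)."""
    channel = np.asarray(channel, dtype=float)
    h, w = channel.shape
    if h < size or w < size:
        raise ValueError("crop is smaller than the target size")
    # Trim so each side is a multiple of `size`, then average equal blocks
    h_trim, w_trim = h - h % size, w - w % size
    blocks = channel[:h_trim, :w_trim].reshape(
        size, h_trim // size, size, w_trim // size)
    small = blocks.mean(axis=(1, 3))
    peak = small.max()
    if peak > 0:
        small = small * (max_value / peak)
    # One row of size*size features, the shape classifier.predict expects
    return small.reshape(1, -1)

# Synthetic stand-in for the red channel of a 16x16 crop
crop = np.zeros((16, 16))
crop[4:12, 6:10] = 200
features = to_digits_features(crop)
print(features.shape, features.max())
```

The result can be passed straight to classifier.predict(features). Peak normalisation is sensitive to a single bright pixel, so a percentile-based scale may be more stable on noisy crops.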
Object Detection
If you haven't isolated the number in the image, you'll need an object detector. The literature on this problem is gigantic and I won't start down that rabbit hole (google Viola and Jones, maybe?). This blog covers the fundamentals of a "sliding window" detector in Python. Adrian Rosebrock looks like he's even a contributor on SO, and that page has some good tutorial-style examples of OpenCV and Python-based object detectors (you actually linked to that blog in your question; I didn't realize).
In short, classify windows across the image and pick the window of highest confidence. Narrowing down the search space with a region of interest will of course yield huge improvements in all areas of performance.
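The "classify every window, keep the most confident" loop can be sketched as follows. The names sliding_windows and best_window are invented for this example, and mean brightness is used as a placeholder for a real classifier's confidence score.

```python
import numpy as np

def sliding_windows(image, window, step):
    """Yield (top, left, patch) for every window position that fits inside the image."""
    win_h, win_w = window
    for top in range(0, image.shape[0] - win_h + 1, step):
        for left in range(0, image.shape[1] - win_w + 1, step):
            yield top, left, image[top:top + win_h, left:left + win_w]

def best_window(image, window, step, score):
    """Return (score, top, left) for the highest-scoring window."""
    return max(((score(patch), top, left)
                for top, left, patch in sliding_windows(image, window, step)),
               key=lambda item: item[0])

# Synthetic stand-in: a bright 8x8 patch on a dark background.
# np.mean stands in for the confidence a trained classifier would return.
scene = np.zeros((32, 32), dtype=np.uint8)
scene[16:24, 8:16] = 255
best_score, best_top, best_left = best_window(scene, (8, 8), 4, np.mean)
print(best_score, best_top, best_left)
```

With a real classifier, score would return something like the predicted class probability for the patch. Restricting the scan to a region of interest simply means passing a cropped image.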