 

Digit Recognition on CNN

I am testing printed digits (0-9) on a convolutional neural network. It reaches 99+% accuracy on the MNIST dataset, but when I train it on images generated from fonts installed on my computer (Arial, Calibri, Cambria, Cambria Math, Times New Roman), the training error rate does not go below 80%, i.e. accuracy stays at about 20%. The font data set is 104 images per font (25 fonts in total, 4 images per font with small differences between them). Why?

Here is a sample of the images for the digit "2":

[Image: sample images of the digit "2"]

I resized every image to 28 x 28.

Here are more details:

The training images are 28 x 28. The network follows the LeNet5 architecture:

Input Layer - 28x28
| Convolutional Layer - (ReLU Activation)
| Pooling Layer - (Tanh Activation)
| Convolutional Layer - (ReLU Activation)
| Local Layer (120 neurons) - (ReLU Activation)
| Fully Connected Layer - (Softmax Activation, 10 outputs)

This works, giving 99+% accuracy on MNIST. Why is it so bad with computer-generated fonts? A CNN should be able to handle a lot of variance in the data.

kumar030 asked Jul 15 '16 07:07



1 Answer

I see two likely problems:

Preprocessing: MNIST images are not just 28 x 28 px. They are also size-normalized and centered:

The original black and white (bilevel) images from NIST were size normalized to fit in a 20x20 pixel box while preserving their aspect ratio. The resulting images contain grey levels as a result of the anti-aliasing technique used by the normalization algorithm. the images were centered in a 28x28 image by computing the center of mass of the pixels, and translating the image so as to position this point at the center of the 28x28 field.

Source: MNIST website
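The normalization quoted above could be approximated for font-rendered digits roughly as follows. This is only a sketch under stated assumptions, not the original NIST procedure: the function name `to_mnist_layout` is made up for illustration, the input is assumed to be a 2-D array with background 0 and ink greater than 0 (dark-on-light font renderings would need to be inverted first), and plain nearest-neighbour sampling stands in for the anti-aliased resize, which an image library would do better.

```python
import numpy as np

def to_mnist_layout(img, box=20, size=28):
    """Fit the glyph into a box x box area (aspect ratio preserved) and place
    it in a size x size field so that its centre of mass is near the centre.
    Assumes background == 0 and ink > 0 (hypothetical helper, not NIST code)."""
    img = np.asarray(img, dtype=np.float64)

    # Crop to the bounding box of the ink.
    rows = np.flatnonzero(img.sum(axis=1) > 0)
    cols = np.flatnonzero(img.sum(axis=0) > 0)
    if rows.size == 0:
        return np.zeros((size, size))
    glyph = img[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]

    # Scale so that the longer side becomes `box` pixels.
    h, w = glyph.shape
    scale = box / max(h, w)
    new_h = max(1, int(round(h * scale)))
    new_w = max(1, int(round(w * scale)))

    # Nearest-neighbour resampling (simplification: no anti-aliasing).
    r_idx = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    c_idx = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    small = glyph[np.ix_(r_idx, c_idx)]

    total = small.sum()
    if total == 0:
        return np.zeros((size, size))

    # Centre of mass of the resized glyph, in its own coordinates.
    cy = (small.sum(axis=1) * np.arange(new_h)).sum() / total
    cx = (small.sum(axis=0) * np.arange(new_w)).sum() / total

    # Offset that moves the centre of mass to the centre of the field,
    # clamped so the glyph never leaves the field.
    centre = (size - 1) / 2.0
    top = min(max(int(round(centre - cy)), 0), size - new_h)
    left = min(max(int(round(centre - cx)), 0), size - new_w)

    out = np.zeros((size, size))
    out[top:top + new_h, left:left + new_w] = small
    return out
```

Running every font image through a step like this before training (instead of resizing the whole image straight to 28 x 28) would make the inputs resemble what an MNIST-tuned network expects.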

Overfitting:

  • MNIST has 60,000 training examples and 10,000 test examples. How many do you have?
  • Did you try dropout (see paper)?
  • Did you try dataset augmentation techniques (e.g. slightly shifting the image, perhaps changing the aspect ratio a bit, or adding noise)? However, I don't think those will help.
  • Did you try smaller networks? (And how big are your filters / how many filters do you have?)
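For the augmentation point, a minimal sketch of what "slightly shifting the image" and "adding noise" could look like is given here. The helper names `shift_image` and `augment` and all parameter defaults are assumptions for illustration; images are assumed to be 2-D float arrays in [0, 1] with a zero background, and shifts are assumed to be smaller than the image size.

```python
import numpy as np

def shift_image(img, dy, dx):
    """Translate img by (dy, dx) pixels, filling vacated pixels with 0.
    Assumes |dy| and |dx| are smaller than the image dimensions."""
    h, w = img.shape
    out = np.zeros_like(img)
    # Part of the source image that stays visible after the shift.
    src = img[max(0, -dy):h - max(0, dy), max(0, -dx):w - max(0, dx)]
    top, left = max(0, dy), max(0, dx)
    out[top:top + src.shape[0], left:left + src.shape[1]] = src
    return out

def augment(images, labels, copies=4, max_shift=2, noise_std=0.0, seed=0):
    """Return the originals plus `copies` randomly shifted (and optionally
    noised) variants of each image, with matching labels."""
    rng = np.random.default_rng(seed)
    out_x, out_y = [], []
    for img, label in zip(images, labels):
        out_x.append(img)
        out_y.append(label)
        for _ in range(copies):
            dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
            new = shift_image(img, int(dy), int(dx))
            if noise_std > 0:
                # Gaussian pixel noise, clipped back to the [0, 1] range.
                new = np.clip(new + rng.normal(0.0, noise_std, new.shape), 0.0, 1.0)
            out_x.append(new)
            out_y.append(label)
    return np.stack(out_x), np.array(out_y)
```

With `copies=4`, each original image yields five training examples. Whether this helps here is an open question, as noted in the list above.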

Remarks

Interesting idea! Did you try simply applying the network trained on MNIST to your data? What are the results?

Martin Thoma answered Sep 23 '22 05:09