
How do I input an image to a neural network?

I understand how neural networks work, but if I want to use them for image processing tasks like actual character recognition, I can't figure out how to feed the image data into the neural net.

Say I have a very large image of the letter A. Maybe I should extract some features/characteristics from the image and then use a vector of those feature values as the input for the neural net?

If anyone has done something like this before, could you explain how?

asked Jan 18 '10 by Dzen

2 Answers

The easiest solution would be to normalize all of your images, both for training and testing, to the same resolution, with the character in each image at roughly the same size. It is also a good idea to use greyscale images, so that each pixel gives you just one number. You can then use each pixel value as one input to your network. For instance, with images of size 16x16 pixels, your network would have 16*16 = 256 input neurons: the first neuron sees the value of the pixel at (0,0), the second the one at (0,1), and so on. In other words, you put the image values into one vector and feed that vector into the network. This should already work.
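A minimal sketch of this approach in Python (Pillow and NumPy are assumed to be available; the file name is just a placeholder):

```python
import numpy as np
from PIL import Image

# Load, convert to greyscale ("L" = 8-bit, one value per pixel),
# and normalize to a fixed resolution.
img = Image.open("letter_a.png").convert("L").resize((16, 16))

# Scale pixel values to [0, 1] and flatten row by row:
# element 0 is pixel (0,0), element 1 is pixel (0,1), and so on.
pixels = np.asarray(img, dtype=np.float32) / 255.0
input_vector = pixels.flatten()

print(input_vector.shape)  # (256,) -- one entry per input neuron
```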

By first extracting features (e.g., edges) from the image and then running the network on those features, you could perhaps speed up learning and also make the detection more robust. What you do in that case is incorporate prior knowledge: for character recognition you know which features are relevant, so by extracting them as a preprocessing step, the network doesn't have to learn them itself. However, if you provide the wrong, i.e. irrelevant, features, the network will not be able to learn the image --> character mapping.
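As one hedged illustration of such a preprocessing step, here is a sketch that computes per-pixel edge strength with a Sobel filter from SciPy; the answer only says "e.g., edges", so the specific filter choice is mine:

```python
import numpy as np
from scipy import ndimage

def edge_features(pixels: np.ndarray) -> np.ndarray:
    """Turn a 2-D greyscale image into a flattened edge-magnitude vector."""
    pixels = pixels.astype(np.float64)       # avoid integer overflow in the gradients
    gx = ndimage.sobel(pixels, axis=0)       # gradient along rows
    gy = ndimage.sobel(pixels, axis=1)       # gradient along columns
    magnitude = np.hypot(gx, gy)             # per-pixel edge strength
    return (magnitude / max(magnitude.max(), 1e-8)).flatten()  # normalize to [0, 1]
```

The network would then be trained on `edge_features(pixels)` instead of the raw pixel vector.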

answered Sep 28 '22 by ahans


The name for the problem you're trying to solve is "feature extraction". It's decidedly non-trivial and a subject of active research.

The naive way to go about this is simply to map each pixel of the image to a corresponding input neuron. Obviously, this only works for images that are all the same size, and is generally of limited effectiveness.

Beyond this, there is a whole host of things you can do: Gabor filters, Haar-like features, PCA and ICA, and sparse features, to name a few popular examples. My advice would be to pick up a textbook on neural networks and pattern recognition or, more specifically, optical character recognition.
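For instance, a sketch of the PCA option using scikit-learn (the component count and the random stand-in data are illustrative assumptions, not recommendations):

```python
import numpy as np
from sklearn.decomposition import PCA

# One flattened 16x16 image per row; random data stands in for a real training set.
X = np.random.rand(100, 256)

pca = PCA(n_components=32)         # keep the 32 strongest components
features = pca.fit_transform(X)    # shape (100, 32): compact inputs for the network

print(features.shape)  # (100, 32)
```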

answered Sep 28 '22 by Martin B