I've been doing a lot of reading on conv nets and even some experimenting with Julia's Mocha.jl package (which looks a lot like Caffe, but can be used interactively from the Julia REPL).
In a conv net, each convolution layer outputs a set of "feature maps." What I'm wondering is how one determines how many feature maps a network needs in order to solve a particular problem. Is there any science to this, or is it more art? I can see that if you're doing classification, at least the last layer should have as many feature maps as there are classes (unless you've got a fully connected MLP at the top of the network, I suppose).
In my case, I'm not doing classification so much as trying to produce a value for every pixel in an image (I suppose this could be seen as a classification where the classes run from 0 to 255).
Edit: as pointed out in the comments, I'm trying to solve a regression problem where the outputs range from 0 to 255 (grayscale in this case). Still, the question remains: how does one determine how many feature maps to use at any given convolution layer? Does this differ for a regression problem versus a classification problem?
The image is taken as input and passed through each of these 10 output functions, one after another in serial order. The last output function is the output of the model itself, so there are 9 intermediate output functions and hence 9 intermediate feature maps.
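To make the serial-chaining idea concrete, here is a minimal, framework-free sketch in Julia. The anonymous layer functions are stand-ins (not Mocha.jl calls); the point is only that everything except the final output counts as an intermediate feature map.

```julia
# A "network" as a chain of functions applied in order.
# The anonymous functions below are stand-ins for real convolution layers.
layers = [x -> 2 .* x, x -> x .+ 1, x -> max.(x, 0)]

function forward(x, layers)
    maps = Any[]
    for f in layers
        x = f(x)          # apply the next layer to the previous output
        push!(maps, x)
    end
    # The last entry is the model output; the rest are intermediate feature maps.
    return maps[end], maps[1:end-1]
end

output, intermediate = forward(rand(4, 4), layers)
```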
To calculate the feature map size, we start with the size of the input image and work out the size of each convolutional layer's output. In the simple case (no padding, stride 1), the output size is input_size - (filter_size - 1). For example, if the input image size is (50, 50) and the filter is (3, 3), then 50 - (3 - 1) = 48.
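As a quick sanity check, the formula is easy to script. This sketch assumes the simple case described above (a "valid" convolution: no padding, stride 1):

```julia
# Output spatial size of a "valid" convolution: input - (filter - 1).
conv_output_size(input_size, filter_size) = input_size - (filter_size - 1)

conv_output_size(50, 3)   # 48
conv_output_size(48, 3)   # 46, i.e. the size after a second 3x3 layer
```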
The number of convolutional layers: in my experience, more convolutional layers is better (within reason, as each convolutional layer reduces the number of input features to the fully connected layers), although after about two or three layers the accuracy gain becomes rather small, so you need to decide whether ...
If the data is less complex and has fewer dimensions or features, then a neural network with 1 to 2 hidden layers will usually work. If the data has many dimensions or features, then 3 to 5 hidden layers can be used to reach a good solution.
Basically, like any other hyperparameter: by evaluating results on a separate development set and finding what number works best. It is also worth checking publications that deal with a similar problem and seeing how many feature maps they used.
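A minimal sketch of that idea, treating the number of feature maps as just another hyperparameter to grid-search. Note that `train_model` and `dev_error` are hypothetical placeholders for whatever your framework (e.g. Mocha.jl) actually provides, not real API calls:

```julia
# Hypothetical grid search over the number of feature maps per conv layer.
# `train_model` and `dev_error` are placeholders, not real library functions.
function pick_feature_maps(candidates)
    best_n, best_err = first(candidates), Inf
    for n in candidates
        model = train_model(n_feature_maps = n)  # hypothetical: fit on the training set
        err   = dev_error(model)                 # hypothetical: error on the dev set
        if err < best_err
            best_n, best_err = n, err
        end
    end
    return best_n, best_err
end

pick_feature_maps([8, 16, 32, 64])
```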