
Understanding darknet's yolo.cfg config files

Tags: yolo, darknet

I have searched around the internet but found very little information about this: I don't understand what each variable/value represents in YOLO's .cfg files. I don't think I'm the only one with this problem, so if anyone knows even 2 or 3 of the variables, please post them so that people who need this information in the future can find it.

The main ones I'd like to understand are:

  • batch
  • subdivisions
  • decay
  • momentum
  • channels
  • filters
  • activation

Reda Drissi asked May 17 '18 11:05




1 Answer

Here is my current understanding of some of the variables. It is not necessarily correct, though:

[net]

  • batch: The number of images (with their labels) used in the forward pass to compute one gradient and update the weights via backpropagation.
  • subdivisions: The batch is split into this many "blocks". The images of one block are run in parallel on the GPU, so each pass holds batch/subdivisions images. Increasing this value lowers GPU memory use.
  • decay: Weight decay, a term that shrinks the weights on every update to avoid large values. For stability reasons, I guess.
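  • channels: As far as I understand, in [net] this is the number of channels of the input image (3 for RGB). How channels arise inside the network was better explained by a figure (not reproduced here):

On the left we have a single channel with 4x4 pixels. The reorganization layer halves the width and height, then creates 4 channels, with adjacent pixels placed in different channels.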

  • momentum: I guess the new gradient is computed as momentum * previous_gradient + (1-momentum) * gradient_of_current_batch. This smooths the gradient across batches and makes it more stable.
  • adam: Uses the Adam optimizer? It doesn't work for me, though.
  • burn_in: For the first x batches, slowly increase the learning rate until it reaches its final value (your learning_rate parameter value). You can use this to decide on a learning rate by watching up to which value the loss keeps decreasing (before it starts to diverge).
  • policy=steps: Use the steps and scales parameters below to adjust the learning rate during training.
  • steps=500,1000: Adjust the learning rate after 500 and 1000 batches.
  • scales=0.1,0.2: After 500 batches, multiply the LR by 0.1; after 1000 batches, multiply it again by 0.2, so it ends up at 0.02 times its initial value.
  • angle: Augment the images by rotating them by up to this angle (in degrees).
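To make the interaction of batch, subdivisions, burn_in, steps and scales more concrete, here is a minimal Python sketch of the behaviour described above. It is not darknet's code: the function names are made up for illustration, and the exponent of the warm-up curve (power=4.0) is an assumption, so treat the exact shape of the burn-in ramp as approximate.

```python
def images_per_gpu_pass(batch, subdivisions):
    """Images processed at once on the GPU; one weight update still uses `batch` images."""
    return batch // subdivisions


def learning_rate(batch_num, base_lr, burn_in, steps, scales, power=4.0):
    """Sketch of policy=steps with a burn_in warm-up.

    `power` is an assumed exponent for the warm-up ramp, not a value from the answer.
    """
    # Warm-up: ramp from 0 up to base_lr over the first `burn_in` batches.
    if batch_num < burn_in:
        return base_lr * (batch_num / burn_in) ** power

    # After warm-up: multiply by each scale whose step has been reached.
    rate = base_lr
    for step, scale in zip(steps, scales):
        if step > batch_num:
            return rate
        rate *= scale
    return rate


if __name__ == "__main__":
    print(images_per_gpu_pass(64, 8))
    for n in (0, 50, 100, 499, 500, 1000):
        print(n, learning_rate(n, 0.001, 100, [500, 1000], [0.1, 0.2]))
```

With steps=500,1000 and scales=0.1,0.2 the scales are cumulative, which is why the rate after batch 1000 is the base rate times 0.1 times 0.2.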

layers

  • filters: How many convolutional kernels there are in a layer.
  • activation: The activation function (relu, leaky relu, etc.). See src/activations.h.
  • stopbackward: Do backpropagation only down to this layer. Put it in the penultimate convolutional layer before the first yolo layer to train only the layers after it, e.g. when using pretrained weights.
  • random: Put this in the yolo layers. If set to 1, do data augmentation by resizing the images to different sizes every few batches. Use it to generalize over object sizes.
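As a rough illustration of what filters and activation do, the sketch below computes the output shape of a convolutional layer and applies a leaky ReLU. The helper names are hypothetical, `pad` here means the number of padding pixels on each side (an assumption of this sketch, which may differ from how the cfg's pad key is interpreted), and the 0.1 slope is the value I believe darknet uses for leaky.

```python
def conv_output_shape(width, height, filters, size, stride, pad):
    """Output (width, height, channels) of a convolution.

    `pad` is padding in pixels per side (an assumption of this sketch).
    The number of output channels equals `filters`: one channel per kernel.
    """
    out_w = (width + 2 * pad - size) // stride + 1
    out_h = (height + 2 * pad - size) // stride + 1
    return out_w, out_h, filters


def leaky(x, slope=0.1):
    """Leaky ReLU: pass positive values through, scale negative ones by `slope`."""
    return x if x > 0 else slope * x


if __name__ == "__main__":
    # A 3x3 kernel with stride 1 and 1 pixel of padding keeps the spatial size.
    print(conv_output_shape(416, 416, filters=32, size=3, stride=1, pad=1))
    # Stride 2 halves the width and height.
    print(conv_output_shape(416, 416, filters=64, size=3, stride=2, pad=1))
    print(leaky(2.0), leaky(-2.0))
```

The input size and filter counts above are only example values, not taken from a particular cfg file.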

Many things are more or less self-explanatory (size, stride, batch_normalize, max_batches, width, height). If you have more questions, feel free to comment.
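For readers who want to inspect these values programmatically, here is a small Python sketch that reads the section/key=value layout of a .cfg file. It is not darknet's parser: `parse_cfg` is a hypothetical helper, and the example text is an illustrative fragment rather than a complete real configuration. A list of sections is used instead of a dictionary because section names such as [convolutional] repeat many times in one file.

```python
def parse_cfg(text):
    """Parse darknet-style cfg text into a list of (section_name, options) pairs."""
    sections = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#") or line.startswith(";"):
            continue
        if line.startswith("[") and line.endswith("]"):
            # A new section; names may repeat, so keep them in order.
            sections.append((line[1:-1], {}))
        elif "=" in line and sections:
            key, value = line.split("=", 1)
            sections[-1][1][key.strip()] = value.strip()
    return sections


# Illustrative fragment only; the values are examples.
example = """
[net]
batch=64
subdivisions=8
width=416
height=416
channels=3

[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=leaky
"""

sections = parse_cfg(example)

if __name__ == "__main__":
    for name, options in sections:
        print(name, options)
```

All values are kept as strings; converting them to int or float is left to the caller, since the expected type depends on the key.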

Again, please keep in mind that I am not 100% certain about many of those.

FelEnd answered Oct 16 '22 09:10
