I created a convolutional neural network with three convolutional layers and two fully connected layers, and I used tf.train.Saver() to save the variables.
When I use inspect_checkpoint.py to check the variables saved in the checkpoint file, I see two additional variables saved for each layer, named <variable>/Adam and <variable>/Adam_1. Why are they there? Also, what are beta1_power and beta2_power?
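Here is a minimal, self-contained example of the save-and-inspect flow (a single stand-in variable instead of my full network; "./model.ckpt" is a placeholder path, and the exact inspect_checkpoint call may vary slightly between TensorFlow versions):

import tensorflow as tf
from tensorflow.python.tools.inspect_checkpoint import print_tensors_in_checkpoint_file

w = tf.get_variable("conv_layer1_w", shape=[1, 16, 1, 32])  # stand-in parameter
loss = tf.reduce_sum(tf.square(w))                          # dummy loss
train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)      # Adam creates its extra variables here

saver = tf.train.Saver()
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(train_op)
    saver.save(sess, "./model.ckpt")

# Print every tensor stored in the checkpoint.
print_tensors_in_checkpoint_file("./model.ckpt", tensor_name="", all_tensors=True)

The variables reported for my real network are: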
conv_layer1_b (DT_FLOAT) [32]
conv_layer1_w (DT_FLOAT) [1,16,1,32]
conv_layer1_b/Adam (DT_FLOAT) [32]
conv_layer1_w/Adam (DT_FLOAT) [1,16,1,32]
conv_layer1_w/Adam_1 (DT_FLOAT) [1,16,1,32]
conv_layer1_b/Adam_1 (DT_FLOAT) [32]
conv_layer3_w/Adam (DT_FLOAT) [1,16,64,64]
conv_layer3_w (DT_FLOAT) [1,16,64,64]
conv_layer3_b/Adam_1 (DT_FLOAT) [64]
conv_layer3_b (DT_FLOAT) [64]
conv_layer3_b/Adam (DT_FLOAT) [64]
conv_layer3_w/Adam_1 (DT_FLOAT) [1,16,64,64]
conv_layer2_w/Adam_1 (DT_FLOAT) [1,16,32,64]
conv_layer2_w/Adam (DT_FLOAT) [1,16,32,64]
conv_layer2_w (DT_FLOAT) [1,16,32,64]
conv_layer2_b/Adam_1 (DT_FLOAT) [64]
conv_layer2_b (DT_FLOAT) [64]
conv_layer2_b/Adam (DT_FLOAT) [64]
beta1_power (DT_FLOAT) []
beta2_power (DT_FLOAT) []
NN1_w (DT_FLOAT) [2432,512]
NN1_b (DT_FLOAT) [512]
NN1_w/Adam_1 (DT_FLOAT) [2432,512]
NN1_b/Adam_1 (DT_FLOAT) [512]
NN1_w/Adam (DT_FLOAT) [2432,512]
NN1_b/Adam (DT_FLOAT) [512]
NN2_w (DT_FLOAT) [512,2]
NN2_b (DT_FLOAT) [2]
NN2_w/Adam_1 (DT_FLOAT) [512,2]
NN2_b/Adam_1 (DT_FLOAT) [2]
NN2_w/Adam (DT_FLOAT) [512,2]
NN2_b/Adam (DT_FLOAT) [2]
You're using the Adam optimizer (https://arxiv.org/abs/1412.6980) for optimization. Adam keeps two state variables for every parameter: the exponential moving averages of the gradient and of its square (m and v in Algorithm 1 of the paper), each with the same shape as the parameter it belongs to. Those are the two extra variables you see per parameter, stored as <variable>/Adam and <variable>/Adam_1. beta1_power and beta2_power are scalar bookkeeping variables in which TensorFlow accumulates the running powers β1^t and β2^t of the decay hyperparameters β1 and β2; they are needed for Adam's bias-correction step, which is why they have an empty shape.
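To make that concrete, here is a small sketch using the TF 1.x tf.train.AdamOptimizer API (the variable name is just a stand-in): after minimize() the optimizer has created one "m" and one "v" slot per trainable variable, plus the two scalar power accumulators.

import tensorflow as tf

w = tf.get_variable("conv_layer1_w", shape=[1, 16, 1, 32])
loss = tf.reduce_sum(tf.square(w))

opt = tf.train.AdamOptimizer(learning_rate=1e-3)
train_op = opt.minimize(loss)

print(opt.get_slot_names())        # ['m', 'v'] -- first and second moment estimates
m = opt.get_slot(w, "m")
v = opt.get_slot(w, "v")
print(m.name, m.shape)             # something like conv_layer1_w/Adam:0, shape (1, 16, 1, 32)
print(v.name, v.shape)             # something like conv_layer1_w/Adam_1:0, shape (1, 16, 1, 32)

# The power accumulators are ordinary non-trainable global variables, so
# tf.train.Saver() saves them along with everything else.
for var in tf.global_variables():
    print(var.name, var.shape)     # includes beta1_power and beta2_power

If you only want to checkpoint the model weights and not the optimizer state, you can pass an explicit var_list (for example tf.trainable_variables()) to tf.train.Saver().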