
Tensorflow: How to set the learning rate in log scale and some Tensorflow questions

I am a deep learning and Tensorflow beginner and I am trying to implement the algorithm in this paper using Tensorflow. The paper uses Matconvnet+Matlab to implement it, and I am curious whether Tensorflow has equivalent functions to achieve the same thing. The paper says:

The network parameters were initialized using the Xavier method [14]. We used the regression loss across four wavelet subbands under l2 penalty and the proposed network was trained by using the stochastic gradient descent (SGD). The regularization parameter (λ) was 0.0001 and the momentum was 0.9. The learning rate was set from 10^-1 to 10^-4 which was reduced in log scale at each epoch.

This paper uses a wavelet transform (WT) and a residual learning method (where the residual image = WT(HR) - WT(HR'), and HR' is used for training). The Xavier method suggests initializing the variables with a normal distribution with

stddev = sqrt(2 / (filter_size * filter_size * num_filters))

Q1. How should I initialize the variables? Is the code below correct?

weights = tf.Variable(tf.random_normal([img_size, img_size, 1, num_filters], stddev=stddev))

This paper does not explain how to construct the loss function in detail. I am also unable to find an equivalent Tensorflow function to set the learning rate in log scale (only exponential_decay). I understand that MomentumOptimizer is equivalent to stochastic gradient descent with momentum.

Q2: Is it possible to set the learning rate in log scale?

Q3: How to create the loss function described above?

I followed this website to write the code below. Assume the model() function returns the network mentioned in this paper and lambda = 0.0001:

inputs = tf.placeholder(tf.float32, shape=[None, patch_size, patch_size, num_channels])
labels = tf.placeholder(tf.float32, [None, patch_size, patch_size, num_channels])

# get the model output and weights for each conv
pred, weights = model()

# define loss function
loss = tf.nn.softmax_cross_entropy_with_logits_v2(labels=labels, logits=pred)

# accumulate the l2 penalty over all conv weights
regularizers = 0.0
for weight in weights:
    regularizers += tf.nn.l2_loss(weight)

loss = tf.reduce_mean(loss + 0.0001 * regularizers)

learning_rate = tf.train.exponential_decay(???) # Not sure if we can have custom learning rate for log scale
optimizer = tf.train.MomentumOptimizer(learning_rate, momentum).minimize(loss, global_step)

NOTE: As I am a deep learning/Tensorflow beginner, I copy-pasted code from here and there, so please feel free to correct it if you can ;)

asked Nov 22 '17 by chesschi



1 Answer

Q1. How should I initialize the variables? Is the code below correct?

Use tf.get_variable or switch to slim (it does the initialization automatically for you).
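For example, here is a minimal sketch of Xavier initialization with tf.get_variable in TF 1.x; the variable name, filter_size and num_filters below are illustrative assumptions, not values fixed by the paper:

import tensorflow as tf

filter_size = 3    # assumed for illustration
num_filters = 64   # assumed for illustration

# Xavier-initialized conv kernel via tf.get_variable (TF 1.x)
weights = tf.get_variable(
    'conv1_weights',
    shape=[filter_size, filter_size, 1, num_filters],
    initializer=tf.contrib.layers.xavier_initializer())

# With slim, the initialization happens inside slim.conv2d automatically, e.g.:
# net = slim.conv2d(inputs, num_filters, [filter_size, filter_size], scope='conv1')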

Q2: Is it possible to set the learning rate in log scale?

You can, but do you need it? This is not the first thing you need to solve in this network. Please check Q3 first.

However, just for reference, you can use the following pattern. Note that exponential_decay needs a global_step tensor that the optimizer increments:

global_step = tf.Variable(0, trainable=False)

learning_rate_node = tf.train.exponential_decay(learning_rate=0.001, global_step=global_step, decay_steps=10000, decay_rate=0.98, staircase=True)

optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate_node).minimize(loss, global_step=global_step)
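That said, exponential_decay already decays geometrically, i.e. in log scale, so the paper's schedule (10^-1 down to 10^-4, reduced once per epoch) can be expressed with it directly. A minimal sketch, where num_epochs and steps_per_epoch are assumed values you would take from your own training setup:

import tensorflow as tf

num_epochs = 50          # assumed for illustration
steps_per_epoch = 1000   # assumed for illustration
start_lr, end_lr = 1e-1, 1e-4

# choose decay_rate so the rate moves from start_lr to end_lr over the epochs
decay_rate = (end_lr / start_lr) ** (1.0 / (num_epochs - 1))

global_step = tf.Variable(0, trainable=False, name='global_step')
learning_rate = tf.train.exponential_decay(
    learning_rate=start_lr,
    global_step=global_step,
    decay_steps=steps_per_epoch,
    decay_rate=decay_rate,
    staircase=True)  # one log-scale drop per epoch

# The paper uses SGD with momentum 0.9; wire it up like this, where `loss`
# is the regression loss discussed in Q3:
# train_op = tf.train.MomentumOptimizer(learning_rate, momentum=0.9).minimize(
#     loss, global_step=global_step)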

Q3: How to create the loss function described above?

First of all, you have not shown the "pred" to "image" conversion in your question (based on the paper, you need to apply a subtraction and the IDWT to obtain the final image).

There is one problem here: the logits have to be calculated based on your label data, i.e. if you use the marked data as "Y : Label", you need to write

pred = model()
# final linear layer and softmax over the network output
pred = tf.matmul(pred, weights) + biases
logits = tf.nn.softmax(pred)
# L1 distance between the softmax output and the labels
loss = tf.reduce_mean(tf.abs(logits - labels))

This will give you the output corresponding to Y : Label.

If your dataset's labeled images are the denoised ones, then you need to follow this instead:

pred = model()
# final linear layer and softmax over the network output
pred = tf.matmul(pred, weights) + biases
logits = tf.nn.softmax(pred)
# reconstruct the final image from the predicted subbands: IDWT(x_label - y_label)
image = apply_IDWT("X : input", logits)
loss = tf.reduce_mean(tf.abs(image - labels))

Logits are the output of your network; you will use them to calculate the rest. Instead of matmul, you can add a conv2d layer here without batch normalization or an activation function, and set the output feature count to 4. Example:

import tensorflow.contrib.slim as slim

pred = model()
# conv layer with 4 output features (the four wavelet subbands),
# no activation function and no batch normalization
pred = slim.conv2d(pred, 4, [3, 3], activation_fn=None, padding='SAME', scope='output')
logits = tf.nn.softmax(pred)
# reconstruct the final image from the predicted subbands: IDWT(x_label - y_label)
image = apply_IDWT("X : input", logits)
loss = tf.reduce_mean(tf.abs(logits - labels))
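apply_IDWT is left abstract above. Just as a sketch of what it could look like for a single-level Haar transform: the code below assumes the four subbands are stacked in the channel dimension in the order LL, LH, HL, HH, and the sign conventions must be adjusted to match whatever forward transform you actually use.

import tensorflow as tf

def inverse_haar_dwt(subbands):
    # subbands: [N, H/2, W/2, 4] holding LL, LH, HL, HH (assumed ordering).
    # Assumed forward convention per 2x2 block [a b; c d]:
    #   LL=(a+b+c+d)/2, LH=(a-b+c-d)/2, HL=(a+b-c-d)/2, HH=(a-b-c+d)/2
    ll, lh, hl, hh = tf.split(subbands, 4, axis=-1)
    a = (ll + lh + hl + hh) / 2.0   # top-left pixel of each 2x2 block
    b = (ll - lh + hl - hh) / 2.0   # top-right
    c = (ll + lh - hl - hh) / 2.0   # bottom-left
    d = (ll - lh - hl + hh) / 2.0   # bottom-right
    # interleave the 2x2 blocks back into a full-resolution [N, H, W, 1] image
    return tf.depth_to_space(tf.concat([a, b, c, d], axis=-1), block_size=2)

How the reconstructed image is then combined with the network input to undo the residual depends on the paper's residual definition (residual image = WT(HR) - WT(HR')).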

This loss function will give you basic training capabilities. However, it is the L1 distance, which may suffer from some issues. Consider the following situation.

Let's say you have the following array as output: [10, 10, 10, 0, 0], and you are trying to achieve [10, 10, 10, 10, 10]. In this case, your loss is 20 (10 + 10), yet you have 3/5 success. It may also indicate some overfitting.

For the same target, consider the output [6, 6, 6, 6, 6]. It still has a loss of 20 (4 + 4 + 4 + 4 + 4). However, if you apply a threshold of 5, you achieve 5/5 success. Hence, this is the case that we want.

If you use L2 loss, for the first case you will have 10^2 + 10^2 = 200 as the loss output. For the second case you will get 4^2 * 5 = 80. Hence, the optimizer will try to move away from case #1 as quickly as possible to achieve overall success, rather than perfect success on some outputs and complete failure on the others. You can apply a loss function like this for that:

tf.reduce_mean(tf.nn.l2_loss(logits - image))
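To make the arithmetic above concrete, here is a tiny standalone check of the two candidate outputs (plain Python, independent of the training graph):

target = [10, 10, 10, 10, 10]
out_a  = [10, 10, 10,  0,  0]   # three exact hits, two complete misses
out_b  = [ 6,  6,  6,  6,  6]   # every value off by 4

l1 = lambda out: sum(abs(o - t) for o, t in zip(out, target))
l2 = lambda out: sum((o - t) ** 2 for o, t in zip(out, target))

print(l1(out_a), l1(out_b))   # 20 20  -> L1 cannot distinguish the two
print(l2(out_a), l2(out_b))   # 200 80 -> L2 prefers the uniformly close output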

Alternatively, you can look at the cross-entropy loss function (it applies softmax internally, so do not apply softmax twice):

tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=image))

answered Nov 15 '22 by Deniz Beker