 

Tensorflow LSTM RNN output activation function

I have an input image with grayscale values ranging from, let's say, 25000 to 35000. I'm doing binary pixel-wise classification, so the output ground truth is a matrix of 0's and 1's.

Does anyone know what the default output activation function is? Is it a ReLU? I want it to be a softmax function, in which case each prediction value would be between 0 and 1 (and ideally close to my ground truth data).

I'm using example code from here, which I have adjusted to work with my data.

I have a working network that is training, but the minibatch loss is currently around 425 and the accuracy is 0.0, whereas for the LSTM MNIST example code (linked) the minibatch loss was about 0.1 and the accuracy about 1.0. My hope is that if I change the activation function to softmax, I can improve the results.

asked Jun 13 '16 by Kendall Weihe

1 Answer

Looking at the code, the default activation function for BasicLSTMCell is tf.tanh(). You can customize the activation function by passing the optional activation argument when constructing the BasicLSTMCell object; any TensorFlow op that expects a single input and produces a single output of the same shape will work. For example:

# Assuming the imports used in the linked example code:
import tensorflow as tf
from tensorflow.python.ops import rnn_cell

# Defaults to using `tf.tanh()`.
lstm_cell = rnn_cell.BasicLSTMCell(n_hidden, forget_bias=1.0)

# Uses `tf.nn.relu()`.
lstm_cell = rnn_cell.BasicLSTMCell(n_hidden, forget_bias=1.0, activation=tf.nn.relu)

# Uses `tf.nn.softmax()`.
lstm_cell = rnn_cell.BasicLSTMCell(n_hidden, forget_bias=1.0, activation=tf.nn.softmax)
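
Note that passing tf.nn.softmax here changes the LSTM's internal activation; it does not by itself give you softmax predictions at the output. For binary pixel-wise classification, a more common pattern is to leave the cell activation alone and squash the final outputs instead. A minimal sketch, assuming outputs is the list returned by the example's rnn() call, and where n_pixels and y_true are illustrative names (not from the linked code) for the per-image pixel count and a ground-truth placeholder:

# Project the last LSTM output ([batch, n_hidden]) to per-pixel logits.
W_out = tf.Variable(tf.truncated_normal([n_hidden, n_pixels], stddev=0.1))
b_out = tf.Variable(tf.zeros([n_pixels]))
logits = tf.matmul(outputs[-1], W_out) + b_out  # shape: [batch, n_pixels]

# A sigmoid squashes each logit independently into (0, 1), which matches a
# 0/1 ground-truth matrix better than a softmax across all pixels.
predictions = tf.nn.sigmoid(logits)

# Train on the raw logits for numerical stability. (In TensorFlow 1.x the
# keyword form `labels=..., logits=...` is required instead.)
loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits, y_true))

Each prediction then lies between 0 and 1 independently per pixel, which is what a 0/1 ground-truth matrix calls for.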
answered Sep 22 '22 by mrry