My current LSTM network looks like this.
rnn_cell = tf.contrib.rnn.BasicLSTMCell(num_units=CELL_SIZE)  # LSTM cell, as described above
init_s = rnn_cell.zero_state(batch_size=1, dtype=tf.float32)  # very first state (c, h)
outputs, final_s = tf.nn.dynamic_rnn(
rnn_cell, # cell you have chosen
tf_x, # input
initial_state=init_s, # the initial hidden state
time_major=False, # False: (batch, time step, input); True: (time step, batch, input)
)
# reshape 3D output to 2D for fully connected layer
outs2D = tf.reshape(outputs, [-1, CELL_SIZE])
net_outs2D = tf.layers.dense(outs2D, INPUT_SIZE)
# reshape back to 3D
outs = tf.reshape(net_outs2D, [-1, TIME_STEP, INPUT_SIZE])
Usually I apply tf.layers.batch_normalization for batch normalization, but I am not sure whether this works in an LSTM network.
b1 = tf.layers.batch_normalization(outputs, momentum=0.4, training=True)  # batch norm over the RNN outputs
d1 = tf.layers.dropout(b1, rate=0.4, training=True)                       # dropout after batch norm
# reshape 3D output to 2D for fully connected layer
outs2D = tf.reshape(d1, [-1, CELL_SIZE])
net_outs2D = tf.layers.dense(outs2D, INPUT_SIZE)
# reshape back to 3D
outs = tf.reshape(net_outs2D, [-1, TIME_STEP, INPUT_SIZE])
A batch-normalized LSTM converges faster than a baseline LSTM. It also shows somewhat improved generalization on permuted sequential MNIST, a task that requires preserving long-term memory.
Batch normalization applied to RNNs is similar to batch normalization applied to CNNs: you compute the statistics in such a way that the recurrent/convolutional properties of the layer still hold after BN is applied.
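To make that concrete, here is a minimal TF 1.x sketch (my addition, not from the post) of how the statistics are shared for a sequence tensor: with the default axis=-1, tf.layers.batch_normalization computes one mean/variance per feature over both the batch and time dimensions, so every time step is normalized the same way, just as a CNN shares its statistics over the spatial dimensions. TIME_STEP and CELL_SIZE are the constants from the snippet above.
import tensorflow as tf

seq = tf.placeholder(tf.float32, [None, TIME_STEP, CELL_SIZE])  # (batch, time, features)
# One mean/variance per feature, computed over batch AND time:
mean, variance = tf.nn.moments(seq, axes=[0, 1])                 # shape (CELL_SIZE,)
seq_bn = (seq - mean) / tf.sqrt(variance + 1e-5)                 # same normalization at every time step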
Batch normalization applies a transformation that maintains the mean output close to 0 and the output standard deviation close to 1. Importantly, batch normalization works differently during training and during inference.
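Because of that train/inference difference, the snippet above with training=True hard-coded keeps using mini-batch statistics even at test time. Here is a minimal TF 1.x sketch of the usual fix (my addition, not the poster's code; `outputs` and `loss` are assumed to come from the graph above):
is_training = tf.placeholder(tf.bool, name='is_training')  # feed True while training, False at inference
b1 = tf.layers.batch_normalization(outputs, momentum=0.4, training=is_training)
# tf.layers.batch_normalization registers its moving-average updates in UPDATE_OPS;
# they must run alongside the train step, or inference will use stale statistics.
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)  # `loss` defined elsewhere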
Batch normalization can be implemented during training by calculating the mean and standard deviation of each input variable to a layer per mini-batch and using these statistics to perform the standardization.
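A hedged sketch of that computation in raw TF 1.x ops (illustrative names and sizes; this is the core of what tf.layers.batch_normalization does at training time, minus the moving averages):
import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 64])          # a mini-batch of 64-dim inputs
gamma = tf.Variable(tf.ones([64]))                   # learned scale
beta = tf.Variable(tf.zeros([64]))                   # learned offset
batch_mean, batch_var = tf.nn.moments(x, axes=[0])   # statistics of this mini-batch only
x_bn = tf.nn.batch_normalization(x, batch_mean, batch_var,
                                 offset=beta, scale=gamma,
                                 variance_epsilon=1e-5)  # standardize, then scale and shift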
If you want to use batch norm for an RNN (LSTM or GRU), you can check out this implementation, or read the full description in the accompanying blog post.
However, layer normalization has advantages over batch norm for sequence data. Specifically, "the effect of batch normalization is dependent on the mini-batch size and it is not obvious how to apply it to recurrent networks" (from the paper Ba et al., Layer Normalization).
Layer normalization instead normalizes the summed inputs within each layer. You can check out the implementation of layer normalization for a GRU cell.
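The linked GRU implementation is not reproduced here; as a rough illustration of the idea (my sketch, reusing the shapes from above): layer normalization computes its statistics over the feature axis of each sample at each time step, so it does not depend on the mini-batch at all. TF 1.x also ships a layer-normalized LSTM cell in contrib that can be dropped in place of BasicLSTMCell.
import tensorflow as tf

h = tf.placeholder(tf.float32, [None, TIME_STEP, CELL_SIZE])   # (batch, time, features)
mean, variance = tf.nn.moments(h, axes=[2], keep_dims=True)     # per sample, per time step
h_ln = (h - mean) / tf.sqrt(variance + 1e-5)                    # no batch statistics involved

# Built-in layer-normalized LSTM cell (contrib, TF 1.x):
ln_cell = tf.contrib.rnn.LayerNormBasicLSTMCell(num_units=CELL_SIZE, layer_norm=True)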