I'm trying to implement a recurrent neural network with numpy.
My current input and output designs are as follows:
x: (sequence length, batch size, input dimension)
h: (number of layers, number of directions, batch size, hidden size)
initial weight: (number of directions, 2 * hidden size, input size + hidden size)
weight: (number of layers - 1, number of directions, hidden size, number of directions * hidden size + hidden size)
bias: (number of layers, number of directions, hidden size)
I have looked up the PyTorch RNN API as a reference (https://pytorch.org/docs/stable/nn.html?highlight=rnn#torch.nn.RNN), but have changed it slightly so that the initial weight is passed in as an input. (The output shapes are supposed to be the same as in PyTorch.)
While it runs, I cannot tell whether it is behaving correctly, since I am feeding it randomly generated numbers as input.
In particular, I am not certain whether my input shapes are designed correctly.
Could any expert give me some guidance?
import numpy as np

def rnn(xs, h, w0, w=None, b=None, num_layers=2, nonlinearity='tanh',
        dropout=0.0, bidirectional=False, training=True):
    # nonlinearity, dropout and training are currently unused; tanh is hard-coded below
    num_directions = 2 if bidirectional else 1
    batch_size = xs.shape[1]
    input_size = xs.shape[2]
    hidden_size = h.shape[3]
    hn = []
    y = [None] * len(xs)
    for l in range(num_layers):
        for d in range(num_directions):
            if l == 0 and d == 0:
                # first layer, forward direction: split w0 into
                # input-to-hidden (wi) and hidden-to-hidden (wh) blocks
                wi = w0[d, :hidden_size, :input_size].T
                wh = w0[d, hidden_size:, input_size:].T
                wi = np.reshape(wi, (1,) + wi.shape)
                wh = np.reshape(wh, (1,) + wh.shape)
            else:
                # remaining layer/direction combinations take their weights from w
                wi = w[max(l - 1, 0), d, :, :hidden_size].T
                wh = w[max(l - 1, 0), d, :, hidden_size:].T
            for i, x in enumerate(xs):
                if l == 0 and d == 0:
                    ht = np.tanh(np.dot(x, wi) + np.dot(h[l, d], wh) + b[l, d][np.newaxis])
                    ht = np.reshape(ht, (batch_size, hidden_size))  # otherwise, shape is (bs, 1, hs)
                else:
                    ht = np.tanh(np.dot(y[i], wi) + np.dot(h[l, d], wh) + b[l, d][np.newaxis])
                y[i] = ht
            hn.append(ht)  # last hidden state of this layer/direction
    y = np.asarray(y)
    y = np.reshape(y, y.shape + (1,))
    return np.asarray(y), np.asarray(hn)
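For what it's worth, this is roughly how I exercise the function on randomly generated data (the sizes are arbitrary, chosen only to match the shapes listed above):

seq_len, batch, input_size, hidden = 5, 3, 4, 6
num_layers, num_directions = 2, 1  # unidirectional

xs = np.random.randn(seq_len, batch, input_size)
h = np.zeros((num_layers, num_directions, batch, hidden))
w0 = np.random.randn(num_directions, 2 * hidden, input_size + hidden)
w = np.random.randn(num_layers - 1, num_directions, hidden, num_directions * hidden + hidden)
b = np.zeros((num_layers, num_directions, hidden))

y, hn = rnn(xs, h, w0, w, b, num_layers=num_layers)
print(y.shape)   # (5, 3, 6, 1)
print(hn.shape)  # (2, 3, 6)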
Regarding the shape, it probably makes sense if that's how PyTorch does it, but the TensorFlow way is a bit more intuitive: (batch_size, seq_length, input_size), i.e. batch_size sequences of length seq_length where each element has size input_size. Both approaches can work, so I guess it's a matter of preference.
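Switching between the two layouts is just an axis transpose, for example:

import numpy as np

xs = np.random.randn(5, 3, 4)                 # (seq_length, batch_size, input_size)
xs_batch_first = np.transpose(xs, (1, 0, 2))  # (batch_size, seq_length, input_size)
print(xs_batch_first.shape)                   # (3, 5, 4)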
To see whether your rnn is behaving appropriately, I'd just print the hidden state at each time step, run it on some small random data (e.g. 5 vectors, 3 elements each) and compare the results with your manual calculations.
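Something along these lines, as a rough sketch for a single layer and a single direction (the slicing of w0 mirrors your function, and the reference is the textbook recurrence h_t = tanh(x_t @ Wi + h_{t-1} @ Wh + b), so any False printed below marks a time step where rnn() deviates from that recurrence):

import numpy as np

seq_len, batch, input_size, hidden = 5, 1, 3, 4
xs = np.random.randn(seq_len, batch, input_size)
h0 = np.zeros((1, 1, batch, hidden))
w0 = np.random.randn(1, 2 * hidden, input_size + hidden)
b = np.zeros((1, 1, hidden))

y, hn = rnn(xs, h0, w0, b=b, num_layers=1)

# manual reference calculation, one time step at a time
wi = w0[0, :hidden, :input_size].T  # (input_size, hidden)
wh = w0[0, hidden:, input_size:].T  # (hidden, hidden)
h_prev = h0[0, 0]
for t in range(seq_len):
    h_prev = np.tanh(xs[t] @ wi + h_prev @ wh + b[0, 0])
    print(t, np.allclose(y[t, :, :, 0], h_prev))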
Looking at your code, I'm unsure if it does what it's supposed to, but rather than building this on your own from an existing API's signature, I'd recommend you read and try to replicate this awesome tutorial from wildml (in part 2 there's a pure numpy implementation).