I am a newbie at implementing language models with Keras RNN structures. I have a dataset of discrete words (not taken from a single paragraph) with the following statistics.
Now, I want to build a model that accepts a character and predicts the next character in the word. I have padded all the words so that they have the same length, so my input Word_input has shape 1953 x 9 and the target has shape 1953 x 9 x 33. I also want to use an Embedding layer. My network architecture is:
self.wordmodel=Sequential()
self.wordmodel.add(Embedding(33,embedding_size,input_length=9))
self.wordmodel.add(LSTM(128, return_sequences=True))
self.wordmodel.add(TimeDistributed(Dense(33)))
self.wordmodel.compile(loss='mse',optimizer='rmsprop',metrics=['accuracy'])
As an example, the word "CAT" with padding is represented as:
Input to network -- START C A T END * * * * (9 characters)
Target of the same --- C A T END * * * * * (9 characters)
So with the TimeDistributed output I am measuring the difference between the network's prediction and the target. I have also set the batch_size to 1, so that after reading each sample word the network resets its state.
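For reference, here is a minimal sketch of how I build one input/target pair (the character-to-index mapping below is only illustrative; my real vocabulary has 33 symbols):

import numpy as np

char2idx = {'*': 0, 'START': 1, 'END': 2, 'C': 3, 'A': 4, 'T': 5}  # illustrative mapping

input_seq = ['START', 'C', 'A', 'T', 'END', '*', '*', '*', '*']   # 9 characters
target_seq = ['C', 'A', 'T', 'END', '*', '*', '*', '*', '*']      # shifted by one step

x = np.array([char2idx[c] for c in input_seq])   # shape (9,)
y = np.zeros((9, 33))                            # one-hot targets, shape (9, 33)
for t, c in enumerate(target_seq):
    y[t, char2idx[c]] = 1.0

# Stacking all 1953 words gives Word_input of shape (1953, 9) and targets of shape (1953, 9, 33).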
My question is: am I doing this conceptually right? Whenever I run training, the accuracy gets stuck at about 56%.
Kindly enlighten me. Thanks.
Character-level embedding uses a one-dimensional convolutional neural network (1D-CNN) to find a numeric representation of words by looking at their character-level composition. You can think of a 1D-CNN as a process in which several scanners slide through a word, character by character.
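As a rough sketch of that idea (the sizes below are arbitrary and not tied to the question's data), a Conv1D layer on top of character embeddings acts like such a scanner:

from keras.models import Sequential
from keras.layers import Embedding, Conv1D, GlobalMaxPooling1D

char_cnn = Sequential()
char_cnn.add(Embedding(33, 8, input_length=9))    # one 8-dim vector per character
char_cnn.add(Conv1D(filters=16, kernel_size=3))   # "scanner" sliding over 3 characters at a time
char_cnn.add(GlobalMaxPooling1D())                # one fixed-length vector per word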
Embeddings make it easier to do machine learning on large inputs like sparse vectors representing words. Ideally, an embedding captures some of the semantics of the input by placing semantically similar inputs close together in the embedding space.
With a character embedding, a vector can be formed for every single word, even an out-of-vocabulary one (optional). Word embeddings, on the other hand, can only handle words seen during training.
Keras provides an Embedding layer that converts each word into a fixed-length, real-valued vector. The one-hot-encoding technique generates a large sparse vector to represent a single word, whereas with an embedding layer every word gets a dense real-valued vector of fixed length.
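For illustration, a minimal sketch of the Embedding layer on its own (the dimensions and index values are arbitrary examples):

import numpy as np
from keras.models import Sequential
from keras.layers import Embedding

emb = Sequential()
emb.add(Embedding(input_dim=33, output_dim=8, input_length=9))  # 33 symbols -> 8-dim dense vectors

batch = np.array([[1, 3, 4, 5, 2, 0, 0, 0, 0],    # two padded words as integer indices
                  [1, 6, 7, 2, 0, 0, 0, 0, 0]])
print(emb.predict(batch).shape)                   # (2, 9, 8): one dense vector per character position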
To my knowledge, the structure is basic and should work to some degree. I have some suggestions.
In the TimeDistributed layer, you should add a softmax activation function, which is widely used in multi-class classification. Right now your structure's output is unbounded, which is not intuitive given that your target is one-hot. With the softmax function, you can change the loss to categorical cross-entropy, which increases the probability of the correct class and decreases the others. That is more appropriate.
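A minimal sketch of that suggestion applied to your architecture (embedding_size is a placeholder for whatever value you already use):

from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense, TimeDistributed

embedding_size = 16   # placeholder; keep your own value

model = Sequential()
model.add(Embedding(33, embedding_size, input_length=9))
model.add(LSTM(128, return_sequences=True))
model.add(TimeDistributed(Dense(33, activation='softmax')))   # softmax gives a distribution over the 33 characters at each step
model.compile(loss='categorical_crossentropy',                # cross-entropy matches the one-hot targets
              optimizer='rmsprop', metrics=['accuracy'])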
You can give it a try. For a more capable model, you could try the following structure, which is given in the PyTorch tutorial. Thanks.