I am trying to build a small LSTM that can learn to write code (even if it's garbage code) by training it on existing Python code. I have concatenated several hundred files, a few thousand lines of code in total, into one training file, with each file ending in <eos> to signify "end of sequence".
As an example, my training file looks like:
setup(name='Keras',
...
],
packages=find_packages())
<eos>
import pyux
...
with open('api.json', 'w') as f:
json.dump(sign, f)
<eos>
I am creating tokens from the words with:
with open(self.textfile, 'r') as file:
    filecontents = file.read()

# collapse blank lines, then turn newlines and 4-space indentation into their own tokens
filecontents = filecontents.replace("\n\n", "\n")
filecontents = filecontents.replace('\n', ' \n ')
filecontents = filecontents.replace('    ', ' \t ')

text_in_words = [w for w in filecontents.split(' ') if w != '']
self._words = set(text_in_words)

# slide a window of seq_length tokens over the text; the token that follows
# each window is the prediction target
STEP = 1
self._codelines = []
self._next_words = []
for i in range(0, len(text_in_words) - self.seq_length, STEP):
    self._codelines.append(text_in_words[i: i + self.seq_length])
    self._next_words.append(text_in_words[i + self.seq_length])
My Keras model is:
model = Sequential()
model.add(Embedding(input_dim=len(self._words), output_dim=1024))
model.add(Bidirectional(
    LSTM(128), input_shape=(self.seq_length, len(self._words))))
model.add(Dropout(rate=0.5))
model.add(Dense(len(self._words)))
model.add(Activation('softmax'))
model.compile(loss='sparse_categorical_crossentropy',
              optimizer="adam", metrics=['accuracy'])
But no matter how much I train it, the model never seems to generate <eos> or even \n. I think it might be because my LSTM size is 128 and my seq_length is 200, but that doesn't quite make sense. Is there something I'm missing?
Dense layers can improve overall accuracy, and a handful of units per layer is a reasonable starting point for small problems, but the output shape of the final Dense layer is determined by the number of units you specify, so for next-word prediction it has to equal the vocabulary size. Every LSTM layer should also be accompanied by a Dropout layer.
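A minimal sketch of that arrangement, assuming a word-level model; vocab_size and seq_length here are illustrative placeholders, not values from your question:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dropout, Dense

vocab_size = 5000   # assumed vocabulary size
seq_length = 200    # assumed input window length

model = Sequential()
model.add(Embedding(input_dim=vocab_size, output_dim=256))
model.add(LSTM(128))                 # recurrent layer
model.add(Dropout(0.5))              # dropout paired with the LSTM layer
model.add(Dense(vocab_size, activation='softmax'))  # output units = vocabulary size
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam')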
The vanilla LSTM network has three layers: an input layer, a single hidden LSTM layer, and a standard feedforward output layer.
The Keras LSTM layer implements the long short-term memory unit introduced by Hochreiter and Schmidhuber in 1997. Based on the available runtime hardware and the constraints you place on the layer, it chooses the most optimized implementation available, either pure TensorFlow or cuDNN-based.
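For example (a hedged illustration, not your code): in TensorFlow 2.x, keeping the LSTM layer's default arguments lets Keras use the fused cuDNN kernel when a GPU is available, while changing certain arguments forces the slower generic implementation.

import tensorflow as tf

# Defaults (tanh activation, sigmoid recurrent_activation, recurrent_dropout=0)
# allow the fused cuDNN kernel to be used on a GPU.
fast_lstm = tf.keras.layers.LSTM(128)

# Setting recurrent_dropout > 0 disqualifies the cuDNN kernel, so Keras falls
# back to the generic implementation.
regularized_lstm = tf.keras.layers.LSTM(128, recurrent_dropout=0.2)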
Sometimes, when there is no limit on the length of the generated sequence, or when the <EOS> or <SOS> tokens are not mapped to numerical token ids, the LSTM never converges. If you could post your outputs or error messages, it would be much easier to debug.
You could create an extra class for building a vocabulary from words and sentences.
# tokens for start of sentence (SOS) and end of sentence (EOS)
SOS_token = 0
EOS_token = 1

class Lang:
    '''
    Class for a word/vocabulary object, storing sentences, words and word counts.
    '''
    def __init__(self, name):
        self.name = name
        self.word2index = {}
        self.word2count = {}
        self.index2word = {0: "SOS", 1: "EOS"}
        self.n_words = 2  # Count SOS and EOS

    def addSentence(self, sentence):
        for word in sentence.split(' '):
            self.addWord(word)

    def addWord(self, word):
        if word not in self.word2index:
            self.word2index[word] = self.n_words
            self.word2count[word] = 1
            self.index2word[self.n_words] = word
            self.n_words += 1
        else:
            self.word2count[word] += 1
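As a hedged usage sketch (the file name training_data.txt is a placeholder for your concatenated corpus), you could build the vocabulary line by line like this:

# Every token in the corpus gets an integer id; ids 0 and 1 are reserved
# for SOS and EOS by the class above.
lang = Lang("python-code")
with open("training_data.txt", "r") as f:
    for line in f:
        lang.addSentence(line.strip())

print(lang.n_words)                    # vocabulary size, including SOS and EOS
print(lang.word2index.get("import"))   # integer id assigned to a token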
Then, while generating text, just adding an <SOS> token at the start would do.
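A rough sketch of such a generation loop (model_predict_next is a hypothetical callable that stands in for whatever sampling you do on the model's softmax output and returns the next token id):

# Hypothetical generation loop: seed with SOS_token, stop at EOS_token
# or after max_len tokens, whichever comes first.
def generate(model_predict_next, lang, max_len=200):
    tokens = [SOS_token]
    while len(tokens) < max_len:
        next_id = model_predict_next(tokens)   # int id of the next token
        if next_id == EOS_token:
            break
        tokens.append(next_id)
    return ' '.join(lang.index2word[t] for t in tokens[1:])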
You can use https://github.com/sherjilozair/char-rnn-tensorflow, a character-level RNN, for reference.