Error when building seq2seq model with tensorflow

I'm trying to understand the seq2seq models defined in seq2seq.py in TensorFlow. I use bits of code copied from the translate.py example that comes with TensorFlow. I keep getting the same error and really do not understand where it comes from.

A minimal code example to reproduce the error:

import tensorflow as tf
from tensorflow.models.rnn import rnn_cell
from tensorflow.models.rnn import seq2seq

encoder_inputs = []
decoder_inputs = []
for i in xrange(350):
    encoder_inputs.append(tf.placeholder(tf.int32, shape=[None],
                                         name="encoder{0}".format(i)))

for i in xrange(45):
    decoder_inputs.append(tf.placeholder(tf.int32, shape=[None],
                                         name="decoder{0}".format(i)))

model = seq2seq.basic_rnn_seq2seq(encoder_inputs,
                                  decoder_inputs, rnn_cell.BasicLSTMCell(512))

The error I get when evaluating the last line (I evaluated it interactively in the Python interpreter):

    >>>  Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/tmp/py1053173el", line 12, in <module>
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/models/rnn/seq2seq.py", line 82, in basic_rnn_seq2seq
        _, enc_states = rnn.rnn(cell, encoder_inputs, dtype=dtype)
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/models/rnn/rnn.py", line 85, in rnn
        output_state = cell(input_, state)
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/models/rnn/rnn_cell.py", line 161, in __call__
        concat = linear.linear([inputs, h], 4 * self._num_units, True)
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/models/rnn/linear.py", line 32, in linear
        raise ValueError("Linear is expecting 2D arguments: %s" % str(shapes))
    ValueError: Linear is expecting 2D arguments: [[None], [None, 512]]

I suspect the error comes from my side :) On a side note: the documentation and the tutorials are really great, but the example code for the sequence-to-sequence model (the English-to-French translation example) is quite dense. You also have to jump between files a lot to understand what's going on. I, at least, got lost several times in the code.

A minimal example (perhaps on some toy data) of constructing and training a basic seq2seq model would be really helpful here. Does anybody know whether this already exists somewhere?

EDIT: I have fixed the code above following @Ishamael's suggestions (meaning it no longer returns errors; see below), but some things are still not clear in this fixed version. My input is a sequence of real-valued vectors of length 2, and my output is a sequence of binary vectors of length 22. Should my tf.placeholder code not look like the following? (EDIT: yes)

tf.placeholder(tf.float32, shape=[None, 2], name="encoder{0}".format(i))
tf.placeholder(tf.float32, shape=[None, 22], name="decoder{0}".format(i))

I also had to change tf.int32 to tf.float32 above. Since my output is binary, should the tf.placeholder for my decoder not be tf.int32? TensorFlow complains again if I do this, and I'm not sure what the reasoning behind this is.

The size of my hidden layer is 512 here.

The complete fixed code:

import tensorflow as tf
from tensorflow.models.rnn import rnn_cell
from tensorflow.models.rnn import seq2seq

encoder_inputs = []
decoder_inputs = []
for i in xrange(350):
    encoder_inputs.append(tf.placeholder(tf.float32, shape=[None, 512],
                                         name="encoder{0}".format(i)))

for i in xrange(45):
    decoder_inputs.append(tf.placeholder(tf.float32, shape=[None, 512],
                                         name="decoder{0}".format(i)))

model = seq2seq.basic_rnn_seq2seq(encoder_inputs,
                                  decoder_inputs, rnn_cell.BasicLSTMCell(512))

asked Nov 17 '15 17:11

user1782011

2 Answers

Most models (seq2seq is no exception) expect their input in batches, so if the shape of your logical input is [n], the tensor you feed to the model should have shape [batch_size x n]. In practice the first dimension is usually declared as None and inferred to be the batch size at runtime.
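To make the batching concrete, here is a minimal pure-Python sketch of how a batch of sequences could be rearranged into one entry per time step, which is the layout a list of per-step placeholders would be fed with. The helper name to_time_major and the toy data are assumptions for illustration only; they are not part of TensorFlow.

```python
def to_time_major(batch):
    """Rearrange [batch_size][num_steps][n] nested lists
    into [num_steps][batch_size][n]."""
    num_steps = len(batch[0])
    if any(len(seq) != num_steps for seq in batch):
        raise ValueError("all sequences in a batch must have the same length")
    # For each time step t, collect the t-th vector of every sequence.
    return [[seq[t] for seq in batch] for t in range(num_steps)]


# Toy batch: 2 sequences, 3 time steps, vectors of length n = 2.
batch = [
    [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],
    [[1.1, 1.2], [1.3, 1.4], [1.5, 1.6]],
]

steps = to_time_major(batch)

# One entry per time step; each entry has shape [batch_size, n] = [2, 2].
print(len(steps), len(steps[0]), len(steps[0][0]))
```

Each element of steps would then be the value for the placeholder of the corresponding time step, so the placeholder's declared shape has to be [None, n] rather than [None].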

Since the logical input to seq2seq is a vector of numbers, the actual tensor shape should be [None, input_sequence_length]. So fixed code would look along the lines of:

input_sequence_length = 2  # the length of one vector in your input sequence

for i in xrange(350):
    # float32 rather than int32, since the inputs are real-valued vectors
    encoder_inputs.append(tf.placeholder(tf.float32,
                                         shape=[None, input_sequence_length],
                                         name="encoder{0}".format(i)))

(and then the same for the decoder)
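As a rough illustration of why the original placeholders failed, the sketch below mimics the kind of check that produces the error in the traceback. The function check_2d is a hypothetical, simplified stand-in written for this example; it is not the actual code in linear.py, and only the error message is taken from the traceback in the question.

```python
def check_2d(shapes):
    """Raise ValueError unless every shape has exactly two dimensions."""
    for shape in shapes:
        if len(shape) != 2:
            raise ValueError(
                "Linear is expecting 2D arguments: %s" % str(shapes))


# Original placeholders: input shape [None] next to the [None, 512] state.
try:
    check_2d([[None], [None, 512]])
    original_ok = True
except ValueError as err:
    original_ok = False
    message = str(err)

# Fixed placeholders: input shape [None, 2], so both arguments are 2D.
check_2d([[None, 2], [None, 512]])
fixed_ok = True
```

With shape [None] the input has only a batch dimension, so it cannot be combined with the two-dimensional state; adding the per-step vector length as a second dimension satisfies the check.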

answered Oct 24 '22 01:10

Ishamael

There is a self-test method in the translate module that shows its minimal usage.

I just ran the self-test method using:

python translate.py --self_test 1

answered Oct 24 '22 01:10

Anurag Ranjan

Tags: python, machine-learning, neural-network, tensorflow, deep-learning
