I'm trying to build a character-level autoencoder for text sequences using Keras. When I compile the model, I get an error concerning the shapes of the tensors, as you can see below. I printed out the layer specifications to check whether the tensor shapes match up, and I found that the problem is probably that the last Lambda layer does not have its output tensor shape specified correctly, but I can't figure out why, or how to specify it, and I didn't find anything about it in the Keras documentation or on Google.
Below the error output is also the part of the code where I define my model (pay attention mainly to the last layer). The whole script, if needed for clarification, is here: PasteBin.
0 <keras.engine.topology.InputLayer object at 0x7f5d290eb588> Input shape (None, 80) Output shape (None, 80)
1 <keras.layers.core.Lambda object at 0x7f5d35f25a20> Input shape (None, 80) Output shape (None, 80, 99)
2 <keras.layers.core.Dense object at 0x7f5d2dda52e8> Input shape (None, 80, 99) Output shape (None, 80, 256)
3 <keras.layers.core.Dropout object at 0x7f5d25004da0> Input shape (None, 80, 256) Output shape (None, 80, 256)
4 <keras.layers.core.Dense object at 0x7f5d2501ac18> Input shape (None, 80, 256) Output shape (None, 80, 128)
5 <keras.layers.core.Dense object at 0x7f5d24dc6cc0> Input shape (None, 80, 128) Output shape (None, 80, 64)
6 <keras.layers.core.Dense object at 0x7f5d24de1fd0> Input shape (None, 80, 64) Output shape (None, 80, 128)
7 <keras.layers.core.Dropout object at 0x7f5d24df4a20> Input shape (None, 80, 128) Output shape (None, 80, 128)
8 <keras.layers.core.Dense object at 0x7f5d24dfeb38> Input shape (None, 80, 128) Output shape (None, 80, 256)
9 <keras.layers.core.Lambda object at 0x7f5d24da6a20> Input shape (None, 80, 256) Output shape (None, 80)
----------------
0 Input Tensor("input_1:0", shape=(?, 80), dtype=int64) Output Tensor("input_1:0", shape=(?, 80), dtype=int64)
1 Input Tensor("input_1:0", shape=(?, 80), dtype=int64) Output Tensor("ToFloat:0", shape=(?, 80, 99), dtype=float32)
2 Input Tensor("ToFloat:0", shape=(?, 80, 99), dtype=float32) Output Tensor("Relu:0", shape=(?, 80, 256), dtype=float32)
3 Input Tensor("Relu:0", shape=(?, 80, 256), dtype=float32) Output Tensor("cond/Merge:0", shape=(?, 80, 256), dtype=float32)
4 Input Tensor("cond/Merge:0", shape=(?, 80, 256), dtype=float32) Output Tensor("Relu_1:0", shape=(?, 80, 128), dtype=float32)
5 Input Tensor("Relu_1:0", shape=(?, 80, 128), dtype=float32) Output Tensor("Relu_2:0", shape=(?, 80, 64), dtype=float32)
6 Input Tensor("Relu_2:0", shape=(?, 80, 64), dtype=float32) Output Tensor("Relu_3:0", shape=(?, 80, 128), dtype=float32)
7 Input Tensor("Relu_3:0", shape=(?, 80, 128), dtype=float32) Output Tensor("cond_1/Merge:0", shape=(?, 80, 128), dtype=float32)
8 Input Tensor("cond_1/Merge:0", shape=(?, 80, 128), dtype=float32) Output Tensor("truediv:0", shape=(?, 80, 256), dtype=float32)
9 Input Tensor("truediv:0", shape=(?, 80, 256), dtype=float32) Output Tensor("ToFloat_1:0", shape=(), dtype=float32)
----------------
Traceback (most recent call last):
File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/framework/tensor_shape.py", line 578, in merge_with
self.assert_same_rank(other)
File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/framework/tensor_shape.py", line 624, in assert_same_rank
"Shapes %s and %s must have the same rank" % (self, other))
ValueError: Shapes (?, ?) and () must have the same rank
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/ops/nn_impl.py", line 153, in sigmoid_cross_entropy_with_logits
labels.get_shape().merge_with(logits.get_shape())
File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/framework/tensor_shape.py", line 585, in merge_with
(self, other))
ValueError: Shapes (?, ?) and () are not compatible
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "lstm.py", line 97, in <module>
autoencoder.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
File "/usr/local/lib/python3.4/dist-packages/keras/engine/training.py", line 667, in compile
sample_weight, mask)
File "/usr/local/lib/python3.4/dist-packages/keras/engine/training.py", line 318, in weighted
score_array = fn(y_true, y_pred)
File "/usr/local/lib/python3.4/dist-packages/keras/objectives.py", line 45, in binary_crossentropy
return K.mean(K.binary_crossentropy(y_pred, y_true), axis=-1)
File "/usr/local/lib/python3.4/dist-packages/keras/backend/tensorflow_backend.py", line 2449, in binary_crossentropy
logits=output)
File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/ops/nn_impl.py", line 156, in sigmoid_cross_entropy_with_logits
% (logits.get_shape(), labels.get_shape()))
ValueError: logits and labels must have the same shape (() vs (?, ?))
I've built my model using this code:
def binarize(x, sz):
    return tf.to_float(tf.one_hot(x, sz, on_value=1, off_value=0, axis=-1))

def binarize_outputshape(in_shape):
    return in_shape[0], in_shape[1], len(chars)

def debinarize(x):
    return tf.to_float(np.argmax(x))  # get the character with most probability

def debinarize_outputshape(in_shape):
    return in_shape[0], in_shape[1]
input_sentence = Input(shape=(max_title_len,), dtype='int64')
# make one-hot vectors out of sentences
one_hot = Lambda(binarize, output_shape=binarize_outputshape, arguments={'sz': len(chars)})(input_sentence)
# shape: max_title_len * chars = 80 * 55 = 4400
encoder = Dense(256, activation='relu')(one_hot)
encoder = Dropout(0.1)(encoder)
encoder = Dense(128, activation='relu')(encoder)
encoder = Dense(64, activation='relu')(encoder)
decoder = Dense(128, activation='relu')(encoder)
decoder = Dropout(0.1)(decoder)
decoder = Dense(256, activation='softmax')(decoder)
# transform back from one-hot vectors
decoder = Lambda(debinarize, output_shape=debinarize_outputshape)(decoder)
autoencoder = Model(input=input_sentence, output=decoder)
First, I feed in a text sequence of at most 80 characters; the first Lambda layer transforms each character into a one-hot vector. At the end, I would like to transform the one-hot vectors back, taking the index of the maximum value as the decoded character.
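For reference, this is the round trip I'm after, reduced to a minimal standalone sketch (toy shapes only, using the same TF 1.x API as above): one-hot over the last axis, then argmax over that same axis, should give the original integer ids back.
import tensorflow as tf

# Toy example: a batch of 2 "sentences" of length 4 over an alphabet of 5
# characters (instead of 80 characters over len(chars) symbols).
ids = tf.constant([[0, 3, 1, 4],
                   [2, 2, 0, 1]], dtype=tf.int64)                # shape (2, 4)
one_hot = tf.one_hot(ids, 5, on_value=1, off_value=0, axis=-1)   # shape (2, 4, 5)
recovered = tf.argmax(one_hot, axis=-1)                          # shape (2, 4) again

with tf.Session() as sess:
    print(sess.run(recovered))   # [[0 3 1 4] [2 2 0 1]]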
As Nassim Ben pointed out, the problem was with the function debinarize. After changing it to:
def debinarize(x):
    return tf.to_float(tf.argmax(x, axis=0))
at least some value is now set for the output tensor's shape. The value is a bit odd, though: it is (80, 256), which differs from the layer's declared output shape of (None, 80). All the other output tensor shapes and output shapes match up (I suppose '?' and None mean more or less the same thing). More specifically, the Lambda layer now looks like this:
<keras.layers.core.Lambda object at 0x7fafcc5a59b0> Input shape (None, 80, 256) Output shape (None, 80)
...
...
Input Tensor("truediv:0", shape=(?, 80, 256), dtype=float32) Output Tensor("ToFloat_1:0", shape=(80, 256), dtype=float32)
The problem is, I would like the output tensor shape to be (?, 80), matching the first layer's input. I did not change any code other than the argmax.
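For what it's worth, a quick shape check on a placeholder with the same shape as the last Dense layer's output reproduces exactly that (80, 256) (a minimal sketch only; TF 1.x API, the placeholder is just for illustration):
import tensorflow as tf

# Placeholder shaped like the last Dense layer's output: (batch, 80, 256).
logits = tf.placeholder(tf.float32, shape=(None, 80, 256))

# This mirrors my current debinarize: argmax over axis 0 reduces the batch
# dimension, leaving a tensor of static shape (80, 256).
collapsed = tf.to_float(tf.argmax(logits, axis=0))
print(collapsed.get_shape())   # (80, 256)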
The error given is now:
Traceback (most recent call last):
File "lstm.py", line 122, in <module>
callbacks=[earlystop_cb, check_cb, keras.callbacks.TensorBoard(log_dir='/tmp/autoencoder')])
File "/usr/local/lib/python3.4/dist-packages/keras/engine/training.py", line 1168, in fit
self._make_train_function()
File "/usr/local/lib/python3.4/dist-packages/keras/engine/training.py", line 760, in _make_train_function
self.total_loss)
File "/usr/local/lib/python3.4/dist-packages/keras/optimizers.py", line 433, in get_updates
m_t = (self.beta_1 * m) + (1. - self.beta_1) * g
File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/ops/math_ops.py", line 883, in binary_op_wrapper
y = ops.convert_to_tensor(y, dtype=x.dtype.base_dtype, name="y")
File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/framework/ops.py", line 651, in convert_to_tensor
as_ref=False)
File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/framework/ops.py", line 716, in internal_convert_to_tensor
ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/framework/constant_op.py", line 176, in _constant_tensor_conversion_function
return constant(v, dtype=dtype, name=name)
File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/framework/constant_op.py", line 165, in constant
tensor_util.make_tensor_proto(value, dtype=dtype, shape=shape, verify_shape=verify_shape))
File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/framework/tensor_util.py", line 360, in make_tensor_proto
raise ValueError("None values not supported.")
ValueError: None values not supported.
I think it comes from using a numpy function on a tensor. Try using the tf.argmax function instead (I think the axis you want to reduce is the last one, axis=2 here, but I'm not sure):
def debinarize(x):
    return tf.to_float(tf.argmax(x, axis=2))  # get the character with most probability
Does this work?
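If it helps, here is a minimal way to check the Lambda output shape in isolation (a sketch only, not your full script; the layer size 256 and the output_shape function are copied from your code):
import tensorflow as tf
from keras.layers import Input, Lambda
from keras.models import Model

def debinarize(x):
    # argmax over the last (class) axis keeps the batch and character dimensions
    return tf.to_float(tf.argmax(x, axis=2))

def debinarize_outputshape(in_shape):
    return in_shape[0], in_shape[1]

inp = Input(shape=(80, 256))
out = Lambda(debinarize, output_shape=debinarize_outputshape)(inp)
print(Model(inp, out).output_shape)   # (None, 80)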