
TensorFlow/Keras: "logits and labels must have the same first dimension" How to squeeze logits or expand labels?

I'm trying to build a simple CNN classifier. My training images (BATCH_SIZEx227x227x1) and labels (BATCH_SIZEx7) are numpy ndarrays that are fed to the model in batches via ImageDataGenerator. The loss function I'm using is sparse_categorical_crossentropy. The problem arises when the model tries to train: with a batch size of 1 (for my simplified experiments), the model outputs logits of shape [1, 7], while the labels have shape [7].

I'm almost positive I know the cause of this, but I am unsure how to fix it. My hypothesis is that sparse_categorical_crossentropy is squeezing the dimensions of my labels (e.g. when BATCH_SIZE is 2, the ground-truth label input is squeezed from [2, 7] to [14]), making it impossible for me to fix the label shape, and all my attempts to fix the logits shape have been fruitless.

I originally tried fixing the labels' shape with np.expand_dims, but the loss function always flattens the labels, no matter how I expand the dimensions.

Following that, I tried adding a tf.keras.layers.Flatten() at the end of my model to get rid of the extraneous first dimension, but it had no effect; I still got exactly the same error. I then tried using tf.keras.layers.Reshape((-1,)) to squeeze all the dimensions. However, that resulted in a different error:

in sparse_categorical_crossentropy
    logits = array_ops.reshape(output, [-1, int(output_shape[-1])])
TypeError: int returned non-int (type NoneType)

Question: How can I squash the shape of the logits to match the shape of the labels produced by sparse_categorical_crossentropy?

### BUILD SHAPE OF THE MODEL ###

model = tf.keras.Sequential([
   tf.keras.layers.Conv2D(32, (3,3), padding='same', activation=tf.nn.relu, 
                          input_shape=(227,227,1)),
   tf.keras.layers.MaxPooling2D((2,2), strides=2),
   tf.keras.layers.Conv2D(64, (3,3), padding='same', activation=tf.nn.relu),
   tf.keras.layers.MaxPooling2D((2,2), strides=2),
   tf.keras.layers.Flatten(),
   tf.keras.layers.Dense(128, activation=tf.nn.relu),
   tf.keras.layers.Dense(7, activation=tf.nn.softmax), # final layer with node for each classification
   #tf.keras.layers.Reshape((-1,))
])

# specify the loss function and optimizer
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')

### TRAIN THE MODEL ###
#specify training metadata
BATCH_SIZE = 1
print("about to train")
# train the model on the training data
num_epochs = 1 
model.fit_generator(generator.flow(train_images, train_labels, batch_size=BATCH_SIZE), epochs=num_epochs)

--- full error trace ---

Traceback (most recent call last):
  File "classifier_model.py", line 115, in <module>
    model.fit_generator(generator.flow(train_images, train_labels, batch_size=BATCH_SIZE), epochs=num_epochs)
  File "/Users/grammiegramco/Desktop/projects/HiRISE/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 1426, in fit_generator
    initial_epoch=initial_epoch)
  File "/Users/grammiegramco/Desktop/projects/HiRISE/lib/python3.6/site-packages/tensorflow/python/keras/engine/training_generator.py", line 191, in model_iteration
    batch_outs = batch_function(*batch_data)
  File "/Users/grammiegramco/Desktop/projects/HiRISE/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 1191, in train_on_batch
    outputs = self._fit_function(ins)  # pylint: disable=not-callable
  File "/Users/grammiegramco/Desktop/projects/HiRISE/lib/python3.6/site-packages/tensorflow/python/keras/backend.py", line 3076, in __call__
    run_metadata=self.run_metadata)
  File "/Users/grammiegramco/Desktop/projects/HiRISE/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1439, in __call__
    run_metadata_ptr)
  File "/Users/grammiegramco/Desktop/projects/HiRISE/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 528, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: logits and labels must have the same first dimension, got logits shape [1,7] and labels shape [7]
     [[{{node loss/dense_1_loss/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits}}]]
ConfusedPerson asked May 25 '19 03:05


2 Answers

No, you got the cause all wrong. You are giving one-hot encoded labels, but sparse_categorical_crossentropy expects integer labels, as it does the one-hot encoding itself (hence, sparse).
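To see why the error reports labels of shape [7], here is a minimal NumPy sketch of the shape bookkeeping. It assumes the sparse loss flattens its label input to one dimension before comparing it with the logits, which is consistent with the error message in the question; the variable names are illustrative only.

```python
import numpy as np

BATCH_SIZE, NUM_CLASSES = 1, 7

# Model output for one batch: one row of class scores per sample.
logits = np.zeros((BATCH_SIZE, NUM_CLASSES))

# One-hot label for class 3, as supplied in the question: shape (1, 7).
one_hot_labels = np.eye(NUM_CLASSES)[[3]]

# Assumption: the sparse loss treats every label entry as a class index,
# so it flattens the labels to 1-D. One-hot input yields 7 entries, not 1.
flattened = one_hot_labels.reshape(-1)

# Integer labels, which the sparse loss expects: one class index per sample.
int_labels = np.argmax(one_hot_labels, axis=-1)

print(logits.shape, flattened.shape, int_labels.shape)  # (1, 7) (7,) (1,)
```

With integer labels the first dimensions agree (1 and 1); with one-hot labels they do not (1 and 7).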

An easy solution would be to change the loss to categorical_crossentropy, not the sparse version. Also note that y_true with shape (7,) is incorrect; it should be (1, 7).
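As a rough illustration, the two options (one-hot labels with the non-sparse loss, or integer labels with the sparse loss) give the same per-sample loss. The two functions below are simplified NumPy re-implementations written for this sketch, not the Keras code, and the batch values are made up. In Keras terms, the corresponding change would be either passing loss='categorical_crossentropy' to model.compile, or keeping the sparse loss and converting the labels with np.argmax(train_labels, axis=-1).

```python
import numpy as np

def categorical_crossentropy(y_true_one_hot, probs):
    """Per-sample cross-entropy for one-hot labels of shape (batch, classes)."""
    return -np.sum(y_true_one_hot * np.log(probs), axis=-1)

def sparse_categorical_crossentropy(y_true_int, probs):
    """Per-sample cross-entropy for integer labels of shape (batch,)."""
    rows = np.arange(probs.shape[0])
    return -np.log(probs[rows, y_true_int])

# Hypothetical batch: 2 samples, 7 classes, true classes 3 and 5.
one_hot_labels = np.eye(7)[[3, 5]]                      # shape (2, 7)
scores = np.random.default_rng(0).normal(size=(2, 7))   # made-up model scores
probs = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)  # softmax

# Convert one-hot rows to class indices for the sparse variant.
int_labels = np.argmax(one_hot_labels, axis=-1)          # shape (2,)

loss_dense = categorical_crossentropy(one_hot_labels, probs)
loss_sparse = sparse_categorical_crossentropy(int_labels, probs)

print(np.allclose(loss_dense, loss_sparse))  # True
```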

Dr. Snoopy answered Oct 18 '22 05:10


Please consider adding a Flatten layer before all the dense layers. I had exactly the same issue as you and had to change from categorical_crossentropy to sparse_categorical_crossentropy. Since sparse_categorical_crossentropy involves one-hot encoding, the array passed on from the CNN layers needs to be reduced from their 4D output to a lower-dimensional (2D) array.

This fixed the issue for me!
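For reference, the 4D-to-2D reshaping that a Flatten layer performs can be sketched in NumPy as below. The sizes are derived from the layer settings in the question (227x227 input, 'same'-padded convolutions, two 2x2 poolings with stride 2, 64 filters), assuming the pooling layers use no padding; pooled_size is a helper made up for this sketch. Note that the model in the question already has a Flatten layer before its Dense layers, so this only illustrates the shapes involved.

```python
import numpy as np

def pooled_size(size, pool=2, stride=2):
    """Output length of an unpadded pooling window along one axis."""
    return (size - pool) // stride + 1

# 'same'-padded convolutions keep 227x227; each 2x2 pool with stride 2 shrinks it.
side = pooled_size(pooled_size(227))  # 227 -> 113 -> 56

# Stand-in for the 4D output of the last pooling layer:
# (batch, height, width, channels).
feature_maps = np.zeros((1, side, side, 64), dtype=np.float32)

# What flattening does: keep the batch axis, collapse everything else.
flat = feature_maps.reshape(feature_maps.shape[0], -1)

print(feature_maps.shape, flat.shape)  # (1, 56, 56, 64) (1, 200704)
```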

elka answered Oct 18 '22 03:10