I'm training a text classification model whose input is a vector of 4096 TF-IDF (term frequency–inverse document frequency) features.
There are 416 possible output categories, and each piece of data belongs to 3 of them, so each label is a vector of length 416 containing 3 ones and 413 zeros (a one-hot-style, or multi-hot, encoding).
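For example, a label for a sample belonging to categories 5, 17, and 300 (hypothetical indices) would be built like this:

import numpy as np

label = np.zeros(416)        # 416 possible categories
label[[5, 17, 300]] = 1      # the 3 categories this sample belongs to
# label now contains 3 ones and 413 zeros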
My model looks like this:
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(2048, activation="relu", input_dim=X.shape[1]))
model.add(Dense(512, activation="relu"))
model.add(Dense(416, activation="sigmoid"))
When I train it with the binary_crossentropy loss, it reaches a loss of 0.185 and an accuracy of 96% after one epoch. After 5 epochs, the loss is 0.037 and the accuracy 99.3%. I suspect this is wrong, since my labels are mostly 0s, and the model gets credit for classifying them correctly.
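To illustrate the suspicion: a model that predicted all zeros would already score 413/416 ≈ 99.3% under an element-wise (binary) notion of accuracy, which is exactly the range I'm seeing. A quick numpy check, using hypothetical category indices:

import numpy as np

y_true = np.zeros(416)
y_true[[5, 17, 300]] = 1     # hypothetical category indices
y_pred = np.zeros(416)       # a model that predicts "no category" everywhere

print(np.mean(np.round(y_pred) == y_true))   # 0.9928 -> ~99.3% without learning anything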
When I train it with the categorical_crossentropy loss, it has a loss of 15.0 and an accuracy below 5% in the first few epochs, before getting stuck at a loss of 5.0 and an accuracy of 12% after more than 50 epochs.
Which of those would be right for my situation (large one-hot-style encodings with multiple 1s)? And what do these scores tell me?
EDIT: These are the model.compile() statements:
model.compile(loss='categorical_crossentropy',
              optimizer=keras.optimizers.Adam(),
              metrics=['accuracy'])
and
model.compile(loss='binary_crossentropy',
              optimizer=keras.optimizers.Adam(),
              metrics=['accuracy'])
In short: the (high) accuracy reported when you use loss='binary_crossentropy' is not the correct one, as you have already guessed. For your problem, the recommended loss is categorical_crossentropy.
In long:
The underlying reason for this behavior is a rather subtle and undocumented issue in how Keras guesses which accuracy to use, depending on the loss function you have selected, when you simply include metrics=['accuracy'] in your model compilation, as you have. In other words, while your first compilation option
model.compile(loss='categorical_crossentropy',
              optimizer=keras.optimizers.Adam(),
              metrics=['accuracy'])
is valid, your second one:
model.compile(loss='binary_crossentropy',
              optimizer=keras.optimizers.Adam(),
              metrics=['accuracy'])
will not produce what you expect, but the reason is not the use of binary cross entropy (which, at least in principle, is an absolutely valid loss function).
Why is that? If you check the metrics source code, Keras does not define a single accuracy metric, but several different ones, among them binary_accuracy and categorical_accuracy. What happens under the hood is that, since you have selected loss='binary_crossentropy' and have not specified a particular accuracy metric, Keras (wrongly...) infers that you are interested in binary_accuracy, and this is what it returns, while in fact you are interested in categorical_accuracy.
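For intuition, here is roughly what the two metrics compute; this is a plain-numpy sketch of the definitions, not the actual Keras backend code:

import numpy as np

def binary_accuracy(y_true, y_pred):
    # element-wise: the fraction of individual 0/1 entries where the
    # rounded prediction matches the label; mostly-zero labels inflate this
    return np.mean(y_true == np.round(y_pred))

def categorical_accuracy(y_true, y_pred):
    # sample-wise: whether the largest predicted score falls on the
    # position of the 1 in each label row
    return np.mean(np.argmax(y_true, axis=-1) == np.argmax(y_pred, axis=-1))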
Let's verify that this is the case, using the MNIST CNN example in Keras, with the following modification:
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])  # WRONG way

model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=2,  # only 2 epochs, for demonstration purposes
          verbose=1,
          validation_data=(x_test, y_test))
# Keras reported accuracy:
score = model.evaluate(x_test, y_test, verbose=0)
score[1]
# 0.9975801164627075
# Actual accuracy calculated manually:
import numpy as np
y_pred = model.predict(x_test)
acc = sum([np.argmax(y_test[i])==np.argmax(y_pred[i]) for i in range(10000)])/10000
acc
# 0.98780000000000001
score[1]==acc
# False
Verifying the above behavior with your own data should be straightforward.
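For instance, something along these lines should do it (a sketch only; X_val and y_val stand for a held-out split with the shapes described in your question):

import numpy as np

y_pred = model.predict(X_val)                    # shape (n_samples, 416)

manual_binary_acc = np.mean(np.round(y_pred) == y_val)
keras_acc = model.evaluate(X_val, y_val, verbose=0)[1]

# with loss='binary_crossentropy' and metrics=['accuracy'], these two
# numbers should roughly agree, confirming Keras reports binary accuracy
print(manual_binary_acc, keras_acc)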
And just for completeness: if, for whatever reason, you insist on using binary cross entropy as your loss function (as I said, nothing wrong with this, at least in principle) while still getting the categorical accuracy required by the problem at hand, you should ask explicitly for categorical_accuracy in the model compilation, as follows:
from keras.metrics import categorical_accuracy
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=[categorical_accuracy])
In the MNIST example, after training, scoring, and predicting the test set as I show above, the two metrics now are the same, as they should be:
# Keras reported accuracy:
score = model.evaluate(x_test, y_test, verbose=0)
score[1]
# 0.98580000000000001
# Actual accuracy calculated manually:
y_pred = model.predict(x_test)
acc = sum([np.argmax(y_test[i])==np.argmax(y_pred[i]) for i in range(10000)])/10000
acc
# 0.98580000000000001
score[1]==acc
# True
System setup:
Python version 3.5.3
Tensorflow version 1.2.1
Keras version 2.0.4