Keras: Use categorical_crossentropy without one-hot encoded array of targets

Tags: python, keras

I have a Keras model that I'm using for a multi-class classification problem. I'm doing this:

model.compile(
    loss='categorical_crossentropy',
    optimizer='adam',
    metrics=['accuracy'],
)

I currently have ~100 features and there are ~2000 possible classes. One-hot encoding the class is leading to memory issues.

Is it possible to use categorical_crossentropy with this Keras model without one-hot encoding the class labels? E.g. instead of a target that looks like:

[0, 0, 0, 1, 0, 0, ...]

It would just be:

3

I looked at the source for categorical_crossentropy in Keras and it assumes two tensors of the same shape. Is there a way to get around this and use the approach I described?

Thanks!

asked Nov 01 '18 by anon_swe

2 Answers

If your targets are one-hot encoded, use categorical_crossentropy. Examples of one-hot encodings:

[1,0,0]
[0,1,0]
[0,0,1]

However, if your targets are integers, use sparse_categorical_crossentropy. Examples of integer encodings:

1
2
3
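For example, here is a minimal sketch (the layer sizes and the ~100 feature / ~2000 class counts are assumptions taken from the question, not from the original answer) showing that plain integer labels can be fed to fit() when the loss is sparse_categorical_crossentropy:

# Sketch only: integer targets with sparse_categorical_crossentropy.
# num_features and num_classes are assumptions based on the question.
import numpy as np
from tensorflow import keras

num_features = 100
num_classes = 2000

model = keras.Sequential([
    keras.Input(shape=(num_features,)),
    keras.layers.Dense(256, activation='relu'),
    keras.layers.Dense(num_classes, activation='softmax'),
])

model.compile(
    loss='sparse_categorical_crossentropy',  # expects integer class labels
    optimizer='adam',
    metrics=['accuracy'],
)

# Targets are plain integers in [0, num_classes), no one-hot encoding needed.
x = np.random.rand(32, num_features).astype('float32')
y = np.random.randint(0, num_classes, size=(32,))
model.fit(x, y, epochs=1, verbose=0)

The model and the loss computation are otherwise identical to categorical_crossentropy; only the expected target format changes.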
answered Sep 18 '22 by Kurtis Streutker

Could you post the rest of your code? To my understanding, when using categorical crossentropy as the loss function, the last layer should use a softmax activation, yielding for each output neuron the probability that the input belongs to that neuron's class, rather than the one-hot vector directly. The categorical crossentropy is then calculated as

loss = -sum_i( y_i * log(p_i) )

where the p_i are these probabilities and the y_i are the entries of the one-hot target. By outputting just the class you wouldn't have access to these probabilities, and thus wouldn't be able to compute the categorical crossentropy.
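For illustration, a minimal sketch of that formula with made-up probabilities and a one-hot target:

# Sketch: categorical crossentropy computed by hand from softmax output p
# and one-hot target y (the values below are invented for illustration).
import numpy as np

p = np.array([0.1, 0.7, 0.2])   # softmax probabilities for a 3-class example
y = np.array([0, 1, 0])         # one-hot target for class 1

loss = -np.sum(y * np.log(p))   # -sum_i y_i * log(p_i)
print(loss)                     # ~0.357, i.e. -log(0.7)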

answered Sep 20 '22 by vlizana