I have a Keras model that I'm using for a multi-class classification problem. I'm doing this:
model.compile(
    loss='categorical_crossentropy',
    optimizer='adam',
    metrics=['accuracy'],
)
I currently have ~100 features and there are ~2000 possible classes. One-hot encoding the class is leading to memory issues.
Is it possible to use categorical_crossentropy with this Keras model without one-hot encoding the class labels? E.g. instead of a target that looks like:
[0, 0, 0, 1, 0, 0, ...]
it would just be:
3
I looked at the source for categorical_crossentropy in Keras, and it assumes two tensors of the same shape. Is there a way to get around this and use the approach I described?
Thanks!
If your targets are one-hot encoded, use categorical_crossentropy.
Examples of one-hot encodings:
[1,0,0]
[0,1,0]
[0,0,1]
However, if your targets are integers, use sparse_categorical_crossentropy.
Examples of integer encodings:
1
2
3
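In Keras this only means swapping the loss string and passing integer targets; the underlying maths is unchanged. A minimal NumPy sketch (values are illustrative) showing that the sparse variant computes the same number as the one-hot variant, without ever materialising the one-hot vector:

```python
import numpy as np

# Predicted class probabilities for one sample (e.g. a softmax output).
p = np.array([0.1, 0.2, 0.6, 0.1])

# One-hot target for class 2 vs. the bare integer label.
y_onehot = np.array([0.0, 0.0, 1.0, 0.0])
y_sparse = 2

# categorical_crossentropy: -sum(y_i * log(p_i))
loss_categorical = -np.sum(y_onehot * np.log(p))

# sparse_categorical_crossentropy: just -log(p[label]);
# no one-hot target vector is ever allocated.
loss_sparse = -np.log(p[y_sparse])

assert np.isclose(loss_categorical, loss_sparse)
```

With ~2000 classes this avoids building a `(num_samples, 2000)` target matrix; on the Keras side, `model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])` plus integer labels in `model.fit` are all that change.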
Could you post the rest of your code? By my understanding, when using categorical crossentropy as the loss function, the last layer should use a softmax activation, yielding for each output neuron the probability that the input belongs to that neuron's class, not the one-hot vector directly. The categorical crossentropy is then calculated as

-Σᵢ yᵢ log(pᵢ)

where the pᵢ are these probabilities (and yᵢ the target values). By just outputting the class you wouldn't have access to these probabilities and thus wouldn't be able to compute the categorical crossentropy.
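To make the comment's point concrete, here is a small NumPy sketch (purely illustrative) of that pipeline: the final layer's raw scores are turned into probabilities by a softmax, and the loss is taken on those probabilities, which is why a hard class output isn't enough:

```python
import numpy as np

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    z = logits - np.max(logits)
    e = np.exp(z)
    return e / e.sum()

# Raw scores from the final dense layer for one sample.
logits = np.array([1.0, 3.0, 0.5])
probs = softmax(logits)  # non-negative, sums to 1.0

# Cross entropy against the true class (here class 1).
true_class = 1
loss = -np.log(probs[true_class])

# If the network emitted a hard one-hot vector instead of probabilities,
# -log(p) would be either 0 or infinity, so the soft probabilities are
# what make the loss informative and differentiable.
```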