I have 10 classes, labeled 0 to 9.
The target output looks something like this:
[0.0, 0.75, 0.0, 1.0, 0.0, 0.875, 0.0, 0.0, 0.0]
The label above is constructed so that index 3 (class 3) is rank one, class 5 is rank two, class 1 is rank three, and all other classes are not relevant, so they get rank zero. In other words, the largest number has the highest rank, and so on. I care only about the ranks themselves, not about the specific values (0.75 etc.) that correspond to each rank.
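To make the labeling concrete, here is a minimal sketch of how I read ranks off such a vector. The helper name `scores_to_ranks` and its `threshold` parameter are just illustrative names, not from any library; the assumption is that anything at or below the threshold counts as not relevant.

```python
def scores_to_ranks(scores, threshold=0.0):
    """Map a score vector to ranks: 1 for the largest score, 2 for the next,
    and 0 for every class whose score is not above `threshold`."""
    # Indices of relevant classes, ordered from highest to lowest score.
    relevant = sorted(
        (i for i, s in enumerate(scores) if s > threshold),
        key=lambda i: scores[i],
        reverse=True,
    )
    ranks = [0] * len(scores)
    for rank, idx in enumerate(relevant, start=1):
        ranks[idx] = rank
    return ranks


label = [0.0, 0.75, 0.0, 1.0, 0.0, 0.875, 0.0, 0.0, 0.0]
print(scores_to_ranks(label))  # [0, 3, 0, 1, 0, 2, 0, 0, 0]
```

Only the ordering matters here: replacing 0.75 with 0.1 would give the same ranks.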
Approach 1 - Regression
Last Dense layer with 10 neurons, a linear activation function, and keras.losses.MeanSquaredError() as the loss.
My model predicts mostly zero, since that is the majority rank.
Approach 2 - Multiclass classification
Last Dense layer with 10 neurons, a softmax activation function, and keras.losses.CategoricalCrossentropy() as the loss. With this approach, we can sort the normalized predictions and apply a threshold, assigning rank zero to everything below it. I only get rank 1 right; the other ranks are squashed down.
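A small sketch of the sort-and-threshold step, in plain Python rather than Keras. The logits and the 0.1 threshold are made-up values for illustration only, and `ranks_above_threshold` is a hypothetical helper. It shows how softmax can push the lower-ranked classes under the threshold even when their order is correct.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ranks_above_threshold(probs, threshold):
    """Rank classes by probability; anything below `threshold` gets rank 0."""
    keep = sorted(
        (i for i, p in enumerate(probs) if p >= threshold),
        key=lambda i: probs[i],
        reverse=True,
    )
    ranks = [0] * len(probs)
    for rank, idx in enumerate(keep, start=1):
        ranks[idx] = rank
    return ranks


# Hypothetical logits: class 3 highest, then class 5, then class 1.
logits = [0.0, 3.0, 0.0, 5.0, 0.0, 4.0, 0.0, 0.0, 0.0, 0.0]
probs = softmax(logits)
pred_ranks = ranks_above_threshold(probs, threshold=0.1)
print(pred_ranks)  # [0, 0, 0, 1, 0, 2, 0, 0, 0, 0]
```

Class 1 has the third-largest logit, but its probability (about 0.09) ends up below the threshold because class 3 takes most of the mass, so it is reported as rank zero.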
Approach 3 - Linear + cosine similarity
I use a linear activation function with cosine similarity as the loss function. The cosine similarity during training and validation is very good, all above 0.9, which suggests gradient descent is optimizing the loss well, but my downstream ranking task is not working: I only get rank 1 and the last rank right, and all the other ranks are wrong.
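One possible reason a high cosine similarity does not translate into correct ranks, sketched in plain Python: a prediction with two ranks swapped can still be almost parallel to the target. The `swapped` vector is a made-up prediction, and `top_order` is a hypothetical helper that lists the relevant classes from best to worst.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_order(scores):
    """Indices of classes with a positive score, best first."""
    return sorted(
        (i for i, s in enumerate(scores) if s > 0),
        key=lambda i: scores[i],
        reverse=True,
    )


target = [0.0, 0.75, 0.0, 1.0, 0.0, 0.875, 0.0, 0.0, 0.0]
# Made-up prediction: classes 1 and 5 have traded places.
swapped = [0.0, 0.875, 0.0, 1.0, 0.0, 0.75, 0.0, 0.0, 0.0]

sim = cosine_similarity(target, swapped)
print(sim > 0.99)          # True
print(top_order(target))   # [3, 5, 1]
print(top_order(swapped))  # [3, 1, 5]
```

So a similarity above 0.9 leaves plenty of room for ranks two and three to be wrong, because the loss rewards getting the values close and not getting the order right.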
I want to know what the right activation function and loss function are for this problem. Is there a custom loss function for ranking?
There is a whole TensorFlow module devoted to ranking problems (TensorFlow Ranking) that you might draw inspiration from. For your examples, perhaps look at:
Mean Squared Loss
Sigmoid Cross Entropy Loss
(there doesn't seem to be an analogous loss for your cosine similarity metric)
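If you end up writing a custom loss, one common idea is a pairwise hinge loss, which only penalizes pairs of classes that are scored in the wrong order (or too close together). Below is a minimal framework-agnostic sketch in plain Python, not the implementation from any library; the function name, the `margin` parameter, and the example predictions are assumptions for illustration. For Keras you would need to rewrite it with tensor operations.

```python
def pairwise_hinge_loss(y_true, y_pred, margin=1.0):
    """Average hinge penalty over all pairs (i, j) where class i should
    outrank class j according to y_true."""
    total, n_pairs = 0.0, 0
    for i in range(len(y_true)):
        for j in range(len(y_true)):
            if y_true[i] > y_true[j]:
                # Penalize unless the predicted score of i beats j by `margin`.
                total += max(0.0, margin - (y_pred[i] - y_pred[j]))
                n_pairs += 1
    return total / n_pairs if n_pairs else 0.0


y_true = [0.0, 0.75, 0.0, 1.0, 0.0, 0.875]
good_pred = [0.0, 2.0, 0.0, 4.0, 0.0, 3.0]  # same order as y_true
bad_pred = [0.0, 4.0, 0.0, 2.0, 0.0, 3.0]   # classes 1 and 3 swapped

print(pairwise_hinge_loss(y_true, good_pred))      # 0.0
print(pairwise_hinge_loss(y_true, bad_pred) > 0)   # True
```

Because the loss depends only on score differences between pairs, the exact target values (0.75, 0.875, and so on) do not matter, only their order.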