Multiclass classification or regression?

Question

I am trying to train a CNN model to classify images based on their aesthetic score. There are 200,000 images, and each image is rated by more than 100 subjects. The mean score is calculated for each image, and the scores are normalized.

The distribution of the scores is approximately Gaussian, so the data is imbalanced. I have decided to build a 10-class classification model and assign an appropriate weight to each class to compensate.

My question:

For this problem, the scores are continuous and ordered, i.e., 0 < 0.2 < 0.3 < 0.4 < 0.5 < ... < 1. Does that mean this is a regression problem? If so, how do I balance the data for a regression problem, given that most of the data points lie between 0.4 and 0.6?

Thanks!

Sagar Dawda · Accepted Answer

Since your labels are continuous, you could divide them into 10 quantile bins (each holding roughly the same number of samples) using a function like pandas.qcut() and assign a class label to each bin. This turns the regression problem into a classification problem.
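
To make the binning step concrete, here is a minimal sketch. The scores are synthetic stand-ins (assumed roughly Gaussian around 0.5), and the variable names are only for illustration. The sketch compares quantile bins from pandas.qcut() with equal-width bins from pandas.cut().

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the normalized mean scores (assumption: roughly
# Gaussian around 0.5, clipped to [0, 1]); replace with the real scores.
rng = np.random.default_rng(0)
scores = pd.Series(
    np.clip(rng.normal(0.5, 0.1, size=10_000), 0.0, 1.0), name="score"
)

# Quantile binning: 10 bins holding (approximately) equal numbers of samples.
# labels=False returns integer bin indices 0..9 instead of interval labels.
quantile_labels = pd.qcut(scores, q=10, labels=False)

# Equal-width binning for comparison: 10 bins of width 0.1 over [0, 1].
bin_edges = np.linspace(0.0, 1.0, 11)
width_labels = pd.cut(scores, bins=bin_edges, labels=False, include_lowest=True)

# Per-class sample counts for each binning scheme.
print(quantile_labels.value_counts().sort_index())
print(width_labels.value_counts().sort_index())
```

Note that quantile bins come out roughly balanced by construction, at the cost of very narrow bins near the middle of the score range. Class imbalance mainly shows up if you use equal-width bins instead.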

As far as the imbalance is concerned, you may want to try oversampling the minority classes. This helps keep your model from being biased toward the majority classes.
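
One simple way to oversample is random oversampling with replacement, sketched below with pandas only. The table, its column names (image_id, label), and the class counts are hypothetical and made up to mimic a bell-shaped imbalance.

```python
import numpy as np
import pandas as pd

# Hypothetical training table: one row per image with its class label.
# The per-class counts are invented for illustration.
counts = [50, 200, 800, 1500, 800, 200, 50]
df = pd.DataFrame({
    "image_id": np.arange(sum(counts)),
    "label": np.repeat(np.arange(len(counts)), counts),
})

# Random oversampling: keep every original row and add rows drawn with
# replacement until each class matches the size of the largest class.
target = df["label"].value_counts().max()
parts = []
for _, group in df.groupby("label"):
    extra = group.sample(n=target - len(group), replace=True, random_state=0)
    parts.append(pd.concat([group, extra]))
balanced = pd.concat(parts, ignore_index=True)

# Every class now has the same number of rows.
print(balanced["label"].value_counts().sort_index())
```

If you go this route, it is usually safer to oversample only the training split, after the train/validation split, so that duplicated images do not end up in the validation set.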

Hope this helps.

Tags:

classification

conv-neural-network

regression

gaussian

AKSHAYAA VAIDYANATHAN
