Good day!
I have a celebrity dataset on which I want to fine-tune a Keras built-in model. So far, from what I have explored and done: we remove the top layers of the original model (or preferably, pass include_top=False), add our own layers, and then train the newly added layers while keeping the earlier layers frozen. This whole thing is pretty intuitive.
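For reference, here is roughly what I have at the moment (a minimal sketch; I'm using ResNet50 just as an example, and the layer sizes are arbitrary choices on my part):

```python
import tensorflow as tf
from tensorflow import keras

# Pre-trained backbone without the 1000-class ImageNet head
base = keras.applications.ResNet50(
    weights="imagenet",
    include_top=False,
    input_shape=(224, 224, 3),
    pooling="avg",
)
base.trainable = False  # freeze the pre-trained layers

inputs = keras.Input(shape=(224, 224, 3))
x = base(inputs, training=False)
x = keras.layers.Dense(256, activation="relu")(x)
outputs = keras.layers.Dense(2, activation="softmax")(x)  # only my 2 celebrities

model = keras.Model(inputs, outputs)
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
```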
Now what I need is for my model to learn to identify the celebrity faces while still being able to detect all the other objects it was trained on before. Originally, the models trained on ImageNet come with an output layer of 1000 neurons, each representing a separate class. I'm confused about how the model should be able to detect the new classes. All the transfer learning and fine-tuning articles and blogs tell us to replace the original 1000-neuron output layer with a different N-neuron layer (N = number of new classes). In my case I have two celebrities, so if I add a new layer with 2 neurons, I don't see how the model is going to classify the original 1000 ImageNet objects.
I need a pointer on this: how exactly can I teach a pre-trained model two new celebrity faces while also maintaining its ability to recognize all 1000 ImageNet objects?
Thanks!
CNNs are prone to forgetting previously learned knowledge when retrained for a new task on a novel domain. This phenomenon is called catastrophic forgetting, and it is an active and challenging research area.
Coming to the point, one obvious way to enable a model to classify new classes along with the old ones is to train it from scratch on the accumulated (old + new) dataset, which is time-consuming.
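If you do go down that route, the setup would look roughly like the sketch below. It assumes a ResNet50 backbone and a hypothetical combined_ds dataset that mixes the ImageNet images with your celebrity images (labels 0-999 for ImageNet, 1000-1001 for the two celebrities); those names and choices are illustrative, not a prescribed recipe:

```python
import tensorflow as tf
from tensorflow import keras

NUM_OLD_CLASSES = 1000  # original ImageNet classes
NUM_NEW_CLASSES = 2     # the two celebrities

# Backbone without the original head; keeping the ImageNet weights as a warm start
base = keras.applications.ResNet50(
    weights="imagenet",
    include_top=False,
    input_shape=(224, 224, 3),
    pooling="avg",
)

inputs = keras.Input(shape=(224, 224, 3))
x = base(inputs)
# One softmax over all 1002 classes: indices 0-999 = ImageNet, 1000-1001 = celebrities
outputs = keras.layers.Dense(
    NUM_OLD_CLASSES + NUM_NEW_CLASSES, activation="softmax"
)(x)
model = keras.Model(inputs, outputs)

model.compile(
    optimizer=keras.optimizers.Adam(1e-4),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# combined_ds is a hypothetical tf.data.Dataset of (image, label) pairs drawn from
# both the old ImageNet data and the new celebrity data, labels in 0..1001.
# model.fit(combined_ds, epochs=...)
```

The key point is that the head covers all old and new classes at once, so the model needs to see (at least a representative sample of) the old data during training; otherwise the ImageNet classes will still be forgotten.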
In contrast, several alternative approaches have been proposed in the (class-incremental) continual-learning literature in recent years to tackle this scenario: