If I train an image caption model then stop to rename a few tokens:
Will either approach effect model accuracy/performance differently?
I would go for option 2.
When training the model from scratch, you are initializing the model's weights randomly and then you fit them based on your problem. However, if, instead of using random weights, you use weights that have already been trained for a similar problem, you may decrease the convergence time. This option is kind similar to the idea of transfer learning.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With