How to provide multiple targets to a Seq2Seq model?

Benchmark:

Model 1: One caption for each video

I trained it on 1108 sport videos, with a batch size of 5, over 60 epochs. This configuration takes about 211 seconds per epoch.

Epoch 1/60 ; Batch loss: 5.185806 ; Batch accuracy: 14.67% ; Test accuracy: 17.64%
Epoch 2/60 ; Batch loss: 4.453338 ; Batch accuracy: 18.51% ; Test accuracy: 20.15%
Epoch 3/60 ; Batch loss: 3.992785 ; Batch accuracy: 21.82% ; Test accuracy: 54.74%
...
Epoch 10/60 ; Batch loss: 2.388662 ; Batch accuracy: 59.83% ; Test accuracy: 58.30%
...
Epoch 20/60 ; Batch loss: 1.228056 ; Batch accuracy: 69.62% ; Test accuracy: 52.13%
...
Epoch 30/60 ; Batch loss: 0.739343 ; Batch accuracy: 84.27% ; Test accuracy: 51.37%
...
Epoch 40/60 ; Batch loss: 0.563297 ; Batch accuracy: 85.16% ; Test accuracy: 48.61%
...
Epoch 50/60 ; Batch loss: 0.452868 ; Batch accuracy: 87.68% ; Test accuracy: 56.11%
...
Epoch 60/60 ; Batch loss: 0.372100 ; Batch accuracy: 91.29% ; Test accuracy: 57.51%

Model 2: 12 captions for each video

Then I trained on the same 1108 sport videos, with a batch size of 64.
This configuration takes about 470 seconds per epoch.

Since I have 12 captions for each video, the total number of samples in my dataset is 1108*12.
That's why I chose this batch size (64 ~= 12*old_batch_size), so the two models run the optimizer roughly the same number of times per epoch.
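The arithmetic behind that batch-size choice can be checked with a short sketch. The helper names (`flatten_pairs`, `steps_per_epoch`) and the placeholder captions are made up for illustration, and counting a final partial batch as one step is an assumption about the training loop:

```python
import math

NUM_VIDEOS = 1108
CAPTIONS_PER_VIDEO = 12


def flatten_pairs(captions_by_video):
    """Turn {video_id: [caption, ...]} into a flat list of (video_id, caption) samples."""
    return [(vid, cap) for vid, caps in captions_by_video.items() for cap in caps]


def steps_per_epoch(num_samples, batch_size):
    """Optimizer updates per epoch, assuming a final partial batch still counts as a step."""
    return math.ceil(num_samples / batch_size)


# Placeholder captions: the real ones would come from the dataset.
captions_by_video = {
    vid: [f"caption {k} of video {vid}" for k in range(CAPTIONS_PER_VIDEO)]
    for vid in range(NUM_VIDEOS)
}
samples = flatten_pairs(captions_by_video)

model1_steps = steps_per_epoch(NUM_VIDEOS, batch_size=5)
model2_steps = steps_per_epoch(len(samples), batch_size=64)
print(len(samples), model1_steps, model2_steps)  # 13296 222 208
```

Under these assumptions the two setups are close but not identical: 222 updates per epoch for Model 1 versus 208 for Model 2.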

Epoch 1/60 ; Batch loss: 5.356736 ; Batch accuracy: 09.00% ; Test accuracy: 20.15%
Epoch 2/60 ; Batch loss: 4.435441 ; Batch accuracy: 14.14% ; Test accuracy: 57.79%
Epoch 3/60 ; Batch loss: 4.070400 ; Batch accuracy: 70.55% ; Test accuracy: 62.52%
...
Epoch 10/60 ; Batch loss: 2.998837 ; Batch accuracy: 74.25% ; Test accuracy: 68.07%
...
Epoch 20/60 ; Batch loss: 2.253024 ; Batch accuracy: 78.94% ; Test accuracy: 65.48%
...
Epoch 30/60 ; Batch loss: 1.805156 ; Batch accuracy: 79.78% ; Test accuracy: 62.09%
...
Epoch 40/60 ; Batch loss: 1.449406 ; Batch accuracy: 82.08% ; Test accuracy: 61.10%
...
Epoch 50/60 ; Batch loss: 1.180308 ; Batch accuracy: 86.08% ; Test accuracy: 65.35%
...
Epoch 60/60 ; Batch loss: 0.989979 ; Batch accuracy: 88.45% ; Test accuracy: 63.45%

Here is an intuitive representation of my datasets:

[Image: Model 1 and Model 2]

How can I interpret these results?

When I manually looked at the test predictions, Model 2 predictions looked more accurate than Model 1 ones.
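The logged test accuracies can be compared directly as a sanity check on that impression. This sketch uses only the epochs excerpted in the two logs above, not the full 60 epochs, and the dictionary and function names are made up for illustration:

```python
# Test accuracy (%) per epoch, copied from the excerpted log lines.
MODEL_1 = {1: 17.64, 2: 20.15, 3: 54.74, 10: 58.30, 20: 52.13,
           30: 51.37, 40: 48.61, 50: 56.11, 60: 57.51}
MODEL_2 = {1: 20.15, 2: 57.79, 3: 62.52, 10: 68.07, 20: 65.48,
           30: 62.09, 40: 61.10, 50: 65.35, 60: 63.45}


def best_epoch(test_acc):
    """Return (epoch, accuracy) for the highest logged test accuracy."""
    epoch = max(test_acc, key=test_acc.get)
    return epoch, test_acc[epoch]


best1 = best_epoch(MODEL_1)
best2 = best_epoch(MODEL_2)
final_gap = round(MODEL_2[60] - MODEL_1[60], 2)
print(best1, best2, final_gap)  # (10, 58.3) (10, 68.07) 5.94
```

Among the logged epochs, both models peak on the test set at epoch 10 and drift lower afterwards while the batch accuracy keeps rising, and Model 2 stays ahead on test accuracy at the end of training.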

In addition, I used a batch size of 64 for Model 2, so I might obtain even better results by choosing a smaller batch size. It seems I can't improve the training of Model 1 the same way, since its batch size is already very low.

On the other hand, Model 1 has better loss and training accuracy results...

What should I conclude?
Does Model 2 constantly overwrite the previously trained captions with the new ones instead of adding new possible captions?
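One way to probe this question is a toy problem rather than the actual Seq2Seq model. The sketch below fits a single softmax output for one fixed input with plain gradient descent on cross-entropy, once with a single target and once with several different targets. Everything here (`fit_one_input`, the 3-class setup, the learning rate) is a hypothetical illustration, and it assumes all targets for the input contribute to the same averaged gradient:

```python
import math


def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fit_one_input(targets, num_classes=3, lr=0.5, steps=2000):
    """Gradient descent on mean cross-entropy for ONE fixed input whose
    training labels are `targets` (class ids). Returns (probs, loss)."""
    empirical = [targets.count(c) / len(targets) for c in range(num_classes)]
    logits = [0.0] * num_classes
    for _ in range(steps):
        probs = softmax(logits)
        # Gradient of mean cross-entropy w.r.t. logits is (probs - empirical).
        logits = [z - lr * (p - q) for z, p, q in zip(logits, probs, empirical)]
    probs = softmax(logits)
    loss = -sum(q * math.log(p) for q, p in zip(empirical, probs) if q > 0)
    return probs, loss


single_probs, single_loss = fit_one_input([1])          # one target, like Model 1
multi_probs, multi_loss = fit_one_input([0, 1, 1, 2])   # several targets, like Model 2
print([round(p, 3) for p in single_probs], round(single_loss, 3))
print([round(p, 3) for p in multi_probs], round(multi_loss, 3))
```

In this toy setting the multi-target fit does not overwrite anything: the output settles on the mix of targets (0.25, 0.5, 0.25), and the loss cannot go below the entropy of that mix (about 1.04), while the single-target loss keeps shrinking toward zero. If the real model behaves similarly, that may partly explain why Model 1 shows lower loss and higher batch accuracy without producing better captions.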

asked Aug 12 '19 02:08

wakobu

1 Answer

I'm not sure I understand this correctly, since I have only worked with neural networks like YOLO, but here is my understanding: you are training a network to caption videos, and now you want to train several captions per video, right? I guess the problem is that you are overwriting your previously trained captions with the new ones instead of adding new possible captions.

You need to train on all possible captions from the start, though I'm not sure whether your network architecture supports this. Getting it to work properly is a bit complex, because you would need to compare your output to all possible captions. You probably also need to output the 20 most likely captions instead of just one to get the best possible result. I'm afraid I can't offer more than this thought, because I wasn't able to find a good source.
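As a rough illustration of comparing an output to all possible captions, here is a minimal sketch that scores a prediction against every reference and keeps the best match. The position-wise token accuracy is an assumption (the metric used in the question is not shown), and the captions and function names are invented for the example:

```python
def token_accuracy(prediction, reference):
    """Fraction of positions where the predicted token equals the reference token.
    Positions beyond the shorter sequence count as errors."""
    length = max(len(prediction), len(reference))
    if length == 0:
        return 1.0
    matches = sum(p == r for p, r in zip(prediction, reference))
    return matches / length


def best_reference_accuracy(prediction, references):
    """Score a prediction against every reference caption and keep the best match."""
    return max(token_accuracy(prediction, ref) for ref in references)


# Invented captions, standing in for several references of one video.
references = [
    "a man is playing soccer".split(),
    "a player kicks the ball".split(),
    "someone is playing football outside".split(),
]
prediction = "a man kicks the ball".split()

single = token_accuracy(prediction, references[0])
multi = best_reference_accuracy(prediction, references)
print(single, multi)  # 0.4 0.8
```

Scored against one arbitrary reference, a reasonable caption can look poor (0.4 here); scored against all references it gets credit for matching any of them (0.8). The same idea extends to evaluating several candidate outputs per video by taking the best score over candidates as well.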

answered Oct 06 '22 00:10

ItsMeTheBee

Tags:

python

tensorflow

deep-learning

recurrent-neural-network