 

What are the inputs to the transformer encoder and decoder in BERT?

I was reading the BERT paper and was not clear regarding the inputs to the transformer encoder and decoder.

For learning masked language model (Cloze task), the paper says that 15% of the tokens are masked and the network is trained to predict the masked tokens. Since this is the case, what are the inputs to the transformer encoder and decoder?

BERT input representation (from the paper)

Is the input to the transformer encoder this input representation (see image above)? If so, what is the decoder input?

Further, how is the output loss computed? Is it a softmax over only the masked positions? And is the same linear layer used for all masked tokens?

asked Feb 24 '20 by mysticsasuke




1 Answer

Ah, but you see, BERT does not include a Transformer decoder. It is only the encoder part, with a classifier added on top.
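To make that concrete, here is a minimal PyTorch sketch (my own illustration, not the reference implementation; the class name and sizes are made up): an encoder stack whose final hidden states feed a single linear classifier over the vocabulary.

```python
import torch
import torch.nn as nn

class TinyBertMLM(nn.Module):
    """Encoder-only model with a token-level classifier, BERT-style (illustrative sizes)."""
    def __init__(self, vocab_size=30522, hidden=256, layers=4, heads=4, max_len=512):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, hidden)
        self.pos_emb = nn.Embedding(max_len, hidden)
        self.seg_emb = nn.Embedding(2, hidden)          # segment A / B embeddings
        enc_layer = nn.TransformerEncoderLayer(d_model=hidden, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=layers)
        self.mlm_head = nn.Linear(hidden, vocab_size)   # the "classifier added on top"

    def forward(self, token_ids, segment_ids):
        # BERT's input representation: sum of token, position, and segment embeddings
        pos = torch.arange(token_ids.size(1), device=token_ids.device).unsqueeze(0)
        x = self.tok_emb(token_ids) + self.pos_emb(pos) + self.seg_emb(segment_ids)
        h = self.encoder(x)                             # no decoder anywhere
        return self.mlm_head(h)                         # logits over the vocab at every position
```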

For masked word prediction, the classifier acts as a decoder of sorts, trying to reconstruct the true identities of the masked words. Non-masked tokens are not included in the classification task and do not affect the loss.
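A common way to express "non-masked positions do not affect the loss" in code is to set their labels to an ignore index, so cross-entropy is computed only where tokens were actually masked. A hedged sketch, assuming `model`, `token_ids`, `segment_ids`, and `labels` from the example above:

```python
import torch.nn.functional as F

# labels: original token ids at the masked positions, -100 everywhere else
# (-100 is the default ignore_index of F.cross_entropy)
logits = model(token_ids, segment_ids)          # (batch, seq_len, vocab)
mlm_loss = F.cross_entropy(
    logits.view(-1, logits.size(-1)),           # flatten to (batch*seq_len, vocab)
    labels.view(-1),                            # flatten to (batch*seq_len,)
    ignore_index=-100,                          # non-masked positions contribute nothing
)
```

Note that the same linear layer produces logits at every position; the masking only determines which positions enter the loss.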

BERT is also trained on next sentence prediction: deciding whether, for a pair of sentences, the second one really follows the first or not.
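That objective is just a second, binary classifier on the final hidden state of the [CLS] token. Another rough sketch, where `h` (the encoder output) and `is_next_labels` are assumed inputs:

```python
import torch.nn as nn
import torch.nn.functional as F

nsp_head = nn.Linear(256, 2)                    # IsNext vs NotNext (hidden size from the sketch above)

# h: final encoder hidden states, shape (batch, seq_len, hidden);
# position 0 is the [CLS] token, whose vector summarizes the sentence pair
cls_vec = h[:, 0, :]
nsp_logits = nsp_head(cls_vec)
nsp_loss = F.cross_entropy(nsp_logits, is_next_labels)   # is_next_labels: (batch,) of 0/1
```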

I do not remember how the two losses are weighted.

I hope this draws a clearer picture.

answered Oct 29 '22 by user2182857