Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

BERT output not deterministic

BERT output is not deterministic. I expect the output values are deterministic when I put a same input, but my bert model the values are changing. Sounds awkwardly, the same value is returned twice, once. That is, once another value comes out, the same value comes out and it repeats. How I can make the output deterministic? let me show snippets of my code. I use the model as below.

For the BERT implementation, I use huggingface implemented BERT pytorch implementation. which is quite fameous model ri implementation in the pytorch area. [link] https://github.com/huggingface/pytorch-pretrained-BERT/

        tokenizer = BertTokenizer.from_pretrained(self.bert_type, do_lower_case=self.do_lower_case, cache_dir=self.bert_cache_path)
        pretrain_bert = BertModel.from_pretrained(self.bert_type, cache_dir=self.bert_cache_path)
        bert_config = pretrain_bert.config

Get the output like this

        all_encoder_layer, pooled_output = self.model_bert(all_input_ids, all_segment_ids, all_input_mask)

        # all_encoder_layer: BERT outputs from all layers.
        # pooled_output: output of [CLS] vec.

pooled_output

tensor([[-3.3997e-01,  2.6870e-01, -2.8109e-01, -2.0018e-01, -8.6849e-02,

tensor([[ 7.4340e-02, -3.4894e-03, -4.9583e-03,  6.0806e-02,  8.5685e-02,

tensor([[-3.3997e-01,  2.6870e-01, -2.8109e-01, -2.0018e-01, -8.6849e-02,

tensor([[ 7.4340e-02, -3.4894e-03, -4.9583e-03,  6.0806e-02,  8.5685e-02,

for the all encoder layer, the situation is same, - same in twice an once.

I extract word embedding feature from the bert, and the situation is same.

wemb_n
tensor([[[ 0.1623,  0.4293,  0.1031,  ..., -0.0434, -0.5156, -1.0220],

tensor([[[ 0.0389,  0.5050,  0.1327,  ...,  0.3232,  0.2232, -0.5383],

tensor([[[ 0.1623,  0.4293,  0.1031,  ..., -0.0434, -0.5156, -1.0220],

tensor([[[ 0.0389,  0.5050,  0.1327,  ...,  0.3232,  0.2232, -0.5383],
like image 419
Keanu Paik Avatar asked Jun 17 '19 23:06

Keanu Paik


People also ask

Is Bert deterministic?

Bert (sentence classification) output is non-deterministic(have checked previous issue, SET model.

What is the output of the Bert model?

The output of the BERT is the hidden state vector of pre-defined hidden size corresponding to each token in the input sequence. These hidden states from the last layer of the BERT are then used for various NLP tasks.

How many parameters does a Bert model have?

BERT's Architecture BERT Base: 12 layers (transformer blocks), 12 attention heads, and 110 million parameters.


Video Answer


1 Answers

Please try to set the seed. I faced the same issue and set the seed to make sure we get same values every time. One of the possible reasons could be dropout taking place in BERT.

like image 55
Srikant Jayaraman Avatar answered Oct 17 '22 12:10

Srikant Jayaraman