BERT output not deterministic

BERT output is not deterministic. I expect the output values to be identical whenever I feed in the same input, but with my BERT model the values keep changing. Oddly, the outputs alternate: one value comes out, then a different value, then the first value again, and the pattern repeats. How can I make the output deterministic? Here are snippets of my code. I load the model as below.

For the BERT implementation, I use the Hugging Face PyTorch implementation of BERT, which is a well-known implementation in the PyTorch community: https://github.com/huggingface/pytorch-pretrained-BERT/

        tokenizer = BertTokenizer.from_pretrained(self.bert_type, do_lower_case=self.do_lower_case, cache_dir=self.bert_cache_path)
        pretrain_bert = BertModel.from_pretrained(self.bert_type, cache_dir=self.bert_cache_path)
        bert_config = pretrain_bert.config

I get the output like this:

        all_encoder_layer, pooled_output = self.model_bert(all_input_ids, all_segment_ids, all_input_mask)

        # all_encoder_layer: BERT outputs from all layers.
        # pooled_output: output of [CLS] vec.

pooled_output for the same input on repeated calls:

tensor([[-3.3997e-01,  2.6870e-01, -2.8109e-01, -2.0018e-01, -8.6849e-02,

tensor([[ 7.4340e-02, -3.4894e-03, -4.9583e-03,  6.0806e-02,  8.5685e-02,

tensor([[-3.3997e-01,  2.6870e-01, -2.8109e-01, -2.0018e-01, -8.6849e-02,

tensor([[ 7.4340e-02, -3.4894e-03, -4.9583e-03,  6.0806e-02,  8.5685e-02,

For all_encoder_layer the situation is the same: the values alternate between two results.

I also extract word embedding features from BERT, and the same thing happens there.

wemb_n
tensor([[[ 0.1623,  0.4293,  0.1031,  ..., -0.0434, -0.5156, -1.0220],

tensor([[[ 0.0389,  0.5050,  0.1327,  ...,  0.3232,  0.2232, -0.5383],

tensor([[[ 0.1623,  0.4293,  0.1031,  ..., -0.0434, -0.5156, -1.0220],

tensor([[[ 0.0389,  0.5050,  0.1327,  ...,  0.3232,  0.2232, -0.5383],

asked Jun 17 '19 23:06

Keanu Paik

1 Answer

Try setting the random seed. I faced the same issue and fixed the seed to make sure I get the same values every time. One possible cause is dropout being active in BERT.
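As a rough illustration of the two points above, here is a minimal sketch using a toy network rather than BERT itself. The names `toy_model` and `x` are hypothetical stand-ins (they are not from the question), and the sketch only assumes that dropout is what makes the outputs vary. It shows that fixing the seed with `torch.manual_seed` reproduces the dropout mask, and that switching to evaluation mode with `eval()` disables dropout altogether.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the BERT model: a tiny network with a dropout layer.
toy_model = nn.Sequential(
    nn.Linear(8, 8),
    nn.Dropout(p=0.5),
    nn.Linear(8, 4),
)

# Hypothetical input batch (2 examples, 8 features each).
x = torch.randn(2, 8)

# Training mode: dropout is active, so the output depends on the random state.
toy_model.train()
torch.manual_seed(123)
train_a = toy_model(x)
torch.manual_seed(123)  # resetting the seed reproduces the same dropout mask
train_b = toy_model(x)

# Evaluation mode: dropout is disabled, so no seed is needed for repeatability.
toy_model.eval()
with torch.no_grad():
    eval_a = toy_model(x)
    eval_b = toy_model(x)

seeded_match = torch.equal(train_a, train_b)
eval_match = torch.equal(eval_a, eval_b)
print("same output with the same seed (train mode):", seeded_match)
print("same output on repeated calls (eval mode):", eval_match)
```

If dropout is indeed the cause in the question's setup, the analogous step would presumably be calling `self.model_bert.eval()` before the forward pass when only extracting features; `eval()` is available on any `torch.nn.Module`. Seeding alone only gives repeatable results if the seed is reset before every call.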

answered Oct 17 '22 12:10

Srikant Jayaraman

Tags: deep-learning, nlp, bert-language-model, transformer, transformer-model
