How do I train gpt 2 from scratch?

I want to train GPT-2 from scratch, but the articles I found only cover fine-tuning pretrained models. I've used https://github.com/nshepperd/gpt-2 to train with an existing model. Should I edit these Python scripts to train from scratch?

Alex asked Dec 13 '19
People also ask

Can I train GPT-2?

There is a project called Teachable NLP with which you can train a GPT-2 model on your own text. It is very easy to use: you simply upload your text file and it trains a GPT-2 model automatically.

What data is GPT-2 trained on?

GPT-2 (like its predecessor GPT) is an unsupervised transformer model trained to generate text by predicting the next word in a sequence of tokens. The largest GPT-2 model has 1.5 billion parameters and was trained on a dataset of 8 million web pages.
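The "predicting the next word" objective can be sketched in a few lines of plain Python (the token IDs and the uniform "model" here are made up purely for illustration):

```python
import math

# GPT-2's language-modeling objective: given a token sequence, the
# inputs are tokens[:-1] and the targets are tokens[1:], so each
# position is trained to predict the token that follows it.
tokens = [5, 2, 7, 3]                 # made-up token IDs
inputs, targets = tokens[:-1], tokens[1:]

def cross_entropy(probs, target):
    # Negative log-likelihood of the correct next token.
    return -math.log(probs[target])

# A stand-in "model" that assigns a uniform distribution over a
# 10-token vocabulary; a real model outputs learned probabilities.
vocab_size = 10
uniform = [1.0 / vocab_size] * vocab_size

loss = sum(cross_entropy(uniform, t) for t in targets) / len(targets)
print(round(loss, 4))  # ln(10) ≈ 2.3026 for the uniform model
```

Training from scratch means minimizing exactly this loss starting from randomly initialized weights, rather than from OpenAI's released checkpoint.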

How much better is GPT-3 than GPT-2?

GPT-2 was known to have poor performance when given tasks in specialized areas such as music and storytelling. GPT-3 can now go further with tasks such as answering questions, writing essays, text summarization, language translation, and generating computer code.


1 Answer

I found the answer in the Issues of this repo: https://github.com/nshepperd/gpt-2

If you want to not use the released model at all, for instance because you want to train a model with incompatible hyperparameters, it should be sufficient to just skip the restore from the released model checkpoint (around train.py:164-177) on your first run so the parameters will all be randomly initialized.
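A minimal sketch of that idea, as a framework-free helper (the function name, the `from_scratch` flag, and the file layout are my assumptions; the actual restore logic lives around train.py:164-177 in that repo):

```python
import os

def resolve_checkpoint(run_dir, released_model_dir, from_scratch=False):
    """Decide which checkpoint, if any, to restore before training.

    Mirrors the advice above: on the first run with from_scratch=True
    we return None, so the model variables keep their random
    initialization instead of loading the released GPT-2 weights.
    """
    own_ckpt = os.path.join(run_dir, "checkpoint")
    if os.path.exists(own_ckpt):
        return own_ckpt                 # resume our own training run
    if from_scratch:
        return None                     # skip restore -> random init
    return os.path.join(released_model_dir, "model.ckpt")
```

Note that once training has produced its own checkpoint, later runs should resume from it either way; the released model is only skipped on the very first run.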

Alex answered Oct 25 '22