
Is it possible to fine-tune FastText models?

I'm working on a project for text similarity using FastText. The basic example I have found to train a model is:

from gensim.models import FastText

model = FastText(tokens, size=100, window=3, min_count=1, iter=10, sorted_vocab=1)

As I understand it, since I'm specifying the vector and n-gram size, the model is being trained from scratch here, and if the dataset is small I wouldn't expect great results.

The other option I have found is to load the original Wikipedia model, which is a huge file:

from gensim.models.wrappers import FastText

model = FastText.load_fasttext_format('wiki.simple')

My question is: can I load the Wikipedia model, or any other pretrained model, and fine-tune it with my dataset?


1 Answer

If you have a labelled dataset, then you should be able to fine-tune to it. This GitHub issue explains that you want to use the pretrainedVectors option: you would start with the Wikipedia pretrained vectors, then train on your own dataset. It seems that gensim can also continue training a loaded model, but according to this GH issue there have been some bugs.
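
For the gensim route, a minimal sketch of continued training looks roughly like the following (assuming gensim >= 3.8, the full wiki.simple.bin file rather than just the .vec vectors, and a hypothetical my_sentences list of tokenized sentences from your own corpus; the parameter choices are illustrative, not an official recipe):

from gensim.models.fasttext import load_facebook_model

# load the complete Facebook-format model; the .bin file is needed because
# the .vec file alone has no subword (n-gram) weights and cannot be trained further
model = load_facebook_model('wiki.simple.bin')

# my_sentences would be your own tokenized corpus, e.g. [['first', 'sentence'], ['second', 'one']]
model.build_vocab(my_sentences, update=True)  # grow the existing vocabulary with any new words
model.train(my_sentences, total_examples=len(my_sentences), epochs=model.epochs)

The pretrainedVectors option, by contrast, belongs to fastText's own supervised (classification) training, where the Wikipedia .vec file is passed as the starting point for a labelled-text classifier rather than used to continue the unsupervised embedding training.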



