
scikit-learn classification using doc2vec representation

I want to classify text documents using doc2vec representation and scikit-learn models.

My problem is that I'm lost on how to get started. Can someone explain the general steps usually taken to use doc2vec with scikit-learn?

MikeAlbert asked Jan 06 '23 00:01


1 Answer

There is a great tutorial here on binary classification with scikit-learn + doc2vec. In short:

  • Use gensim to train or load your doc2vec model.
  • Each input text is converted to a fixed-dimension vector of floats (the same dimension as your embedding). These vectors are the actual input features.
  • Feed those features to any classifier in scikit-learn.
greeness answered Jan 07 '23 12:01