I am trying to use Gensim's Word2Vec implementation. Gensim warns that if you don't have a C compiler, the training will be 70% slower. Is there away to verify that Gensim is correctly using the C Compiler I have installed?
I am using Anaconda Python 3.5 on Windows 10.
Gensim : It is an open source library in python written by Radim Rehurek which is used in unsupervised topic modelling and natural language processing. It is designed to extract semantic topics from documents. It can handle large text collections.
Gensim is a free open-source Python library for representing documents as semantic vectors, as efficiently (computer-wise) and painlessly (human-wise) as possible. Gensim is designed to process raw, unstructured digital texts (“plain text”) using unsupervised machine learning algorithms.
Gensim is billed as a Natural Language Processing package that does 'Topic Modeling for Humans'. But it is practically much more than that. It is a leading and a state-of-the-art package for processing texts, working with word vector models (such as Word2Vec, FastText etc) and for building topic models.
Gensim is a Python library for topic modelling, document indexing and similarity retrieval with large corpora. Target audience is the natural language processing (NLP) and information retrieval (IR) community.
Gensim provides both wheels and an installer for Windows.
pip install gensim
should get you gensim with Cython optimization without the work of getting Cython up and running (not that it's not great to have Cython, but sometimes it's nice to just have stuff run).
Apparently gensim offers a variable to detect this:
assert gensim.models.doc2vec.FAST_VERSION > -1
I found this line in this tutorial: https://github.com/RaRe-Technologies/gensim/blob/develop/docs/notebooks/doc2vec-IMDB.ipynb
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With