I'm new to these frameworks as well as to NLP. I am following an example that gives the following code snippet to calculate the tf-idf score of all the tokens in the tweets. However, I keep getting either import errors or a "Vectorizer is not defined" error.
Code:
import spacy
from textacy.vsm import Vectorizer
import textacy.vsm
vectorizer = Vectorizer(weighting = 'tfidf')
term_matrix = vectorizer.fit_transform(
    [tok.lemma_ for tok in doc] for doc in spacy_tweets)
Errors received:
from textacy.vsm import Vectorizer
ImportError: cannot import name 'Vectorizer'
//
import textacy
vectorizer = textacy.Vectorizer(weighting='tfidf')
AttributeError: module 'textacy' has no attribute 'Vectorizer'
//
import textacy
vectorizer = Vectorizer(weighting='tfidf')
NameError: name 'Vectorizer' is not defined
My environment:
operating system: Windows 10 64-bit
python version: Python 3.6.4 :: Anaconda, Inc.
spaCy version: 1.9.0-np111py36_vc14_1 installed
spaCy models: en_core_web_sm
textacy version: 0.3.4-py36_0
What is the correct import statement to access the textacy vectorizer class?
When using conda, version 0.3.4 of textacy is installed. This version does not have the Vectorizer. Instead, install textacy from the PyPI project (for example, with pip):
https://pypi.org/project/textacy/
To check whether you have the Vectorizer, you can do the following:
In [1]: import textacy
In [2]: dir(textacy)
Out[2]:
['Corpus',
'Doc',
'TextStats',
'TopicModel',
'Vectorizer',
'__builtins__',
'__cached__',
'__doc__',
'__file__',
'__loader__',
'__name__',
'__package__',
'__path__',
'__spec__',
'__version__',
'about',
'absolute_import',
'cache',
'compat',
'constants',
'corpus',
'data_dir',
'doc',
'extract',
'io',
'load_spacy',
'logger',
'logging',
'network',
'os',
'preprocess',
'preprocess_text',
'spacy_utils',
'text_stats',
'text_utils',
'tm',
'utils',
'viz',
'vsm']
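The same check can be scripted instead of reading through the dir() output by eye. The following is only a minimal sketch: find_attribute is a hypothetical helper name (not part of textacy), and it simply tries the two import locations mentioned in the question, textacy.vsm and textacy, reporting the first one that exposes Vectorizer. Where the class actually lives depends on your installed version.

```python
import importlib


def find_attribute(module_names, attr_name):
    """Return (module_name, attribute) for the first importable module
    that exposes attr_name, or None if no candidate has it.

    Hypothetical helper for illustration; not part of textacy.
    """
    for module_name in module_names:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            # The module (or its parent package) is not installed.
            continue
        if hasattr(module, attr_name):
            return module_name, getattr(module, attr_name)
    return None


# Candidate locations are taken from the import attempts in the question.
found = find_attribute(["textacy.vsm", "textacy"], "Vectorizer")
if found is None:
    print("No Vectorizer found; textacy may be missing or too old.")
else:
    print("Vectorizer is available from", found[0])
```

If this reports that no Vectorizer was found, printing textacy.__version__ (it appears in the dir() listing above) should show whether the old conda build is still the one being imported.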