I've been trying to practise what I've learned from this tutorial (https://realpython.com/sentiment-analysis-python/) using PyCharm.
And this line:
textcat.add_label("pos")
generated a warning: Cannot find reference 'add_label' in '(Doc) -> Doc | (Doc) -> Doc'
I understand that this is because "nlp.create_pipe()" generates a Doc, not a string. Since I didn't know what to do about the warning, I ran the script anyway, and then got an error from this line:
textcat = nlp.create_pipe("textcat", config={"architecture": "simple_cnn"})
Error msg:
raise ConfigValidationError(
thinc.config.ConfigValidationError:
Config validation error
textcat -> architecture extra fields not permitted
{'nlp': <spacy.lang.en.English object at 0x0000015E74F625E0>, 'name': 'textcat', 'architecture': 'simple_cnn', 'model': {'@architectures': 'spacy.TextCatEnsemble.v2', 'linear_model': {'@architectures': 'spacy.TextCatBOW.v1', 'exclusive_classes': True, 'ngram_size': 1, 'no_output_layer': False}, 'tok2vec': {'@architectures': 'spacy.Tok2Vec.v2', 'embed': {'@architectures': 'spacy.MultiHashEmbed.v1', 'width': 64, 'rows': [2000, 2000, 1000, 1000, 1000, 1000], 'attrs': ['ORTH', 'LOWER', 'PREFIX', 'SUFFIX', 'SHAPE', 'ID'], 'include_static_vectors': False}, 'encode': {'@architectures': 'spacy.MaxoutWindowEncoder.v2', 'width': 64, 'window_size': 1, 'maxout_pieces': 3, 'depth': 2}}}, 'threshold': 0.5, '@factories': 'textcat'}
I'm using:
Man! Did that full spaCy upgrade really obliterate that tutorial or what...
There are a couple of things you might be able to work around. I haven't fully fixed that broken tutorial; it's on the to-do list. However, I did get around the exact issue you're having.
textcat = nlp.create_pipe("textcat", config={"architecture": "simple_cnn"})
This create_pipe behavior has been deprecated, so you can add the component directly to the workflow with add_pipe. So one thing you could do is the following:
from spacy.pipeline.textcat import single_label_cnn_config
<more good code>
nlp = spacy.load("en_core_web_trf")
if "textcat" not in nlp.pipe_names:
    nlp.add_pipe('textcat', config=single_label_cnn_config, last=True)
textcat = nlp.get_pipe('textcat')
textcat.add_label("pos")
textcat.add_label("neg")
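If you want to see why the old "architecture" key gets rejected, one way is to look at the default config the "textcat" factory validates against. This is only a small sketch: it assumes spaCy 3.x is installed, and it uses a blank English pipeline (my choice, not from the tutorial) so no model download is needed. The exact keys printed depend on your spaCy version.

```python
import spacy

# Assumption: a blank English pipeline is enough to look up factory metadata.
nlp = spacy.blank("en")

# The factory metadata carries the default config for the "textcat" component.
meta = nlp.get_factory_meta("textcat")

# Top-level settings the factory accepts; "architecture" is not one of them,
# the architecture is now chosen inside the nested "model" block.
allowed_keys = sorted(meta.default_config.keys())
print(allowed_keys)
```

Anything you pass in config has to fit under those keys, which is why the spaCy 2 style {"architecture": "simple_cnn"} fails validation.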
Let me know if this makes sense and helps. I'll try to revamp the tutorial entirely from spaCy in the coming weeks.
This seems to have worked with spaCy 3.1.0:
import en_core_web_md # or skip, see below
from spacy.pipeline.textcat import Config, single_label_cnn_config
nlp = en_core_web_md.load() # or nlp=spacy.load("en_core_web_sm")
config = Config().from_str(single_label_cnn_config)
if "textcat" not in nlp.pipe_names:
    nlp.add_pipe('textcat', config=config, last=True)
nlp.pipe_names
# ['tok2vec', 'tagger', 'parser', 'attribute_ruler', 'lemmatizer', 'ner', 'textcat']
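For reference, here is a minimal end-to-end sketch that also adds the two labels from the question. It assumes spaCy 3.x and swaps in spacy.blank("en") for the pretrained model purely so it runs without a download; importing Config from thinc.api is also my choice, it is the same class that spacy.pipeline.textcat re-exports.

```python
import spacy
from spacy.pipeline.textcat import single_label_cnn_config
from thinc.api import Config

# Assumption: a blank pipeline stands in for en_core_web_md / en_core_web_sm.
nlp = spacy.blank("en")

# single_label_cnn_config is a config string, so parse it into a Config first.
config = Config().from_str(single_label_cnn_config)

if "textcat" not in nlp.pipe_names:
    nlp.add_pipe("textcat", config=config, last=True)

# Fetch the component from the pipeline and register the labels on it.
textcat = nlp.get_pipe("textcat")
textcat.add_label("pos")
textcat.add_label("neg")

print(nlp.pipe_names)
print(textcat.labels)
```

Calling nlp.get_pipe("textcat") returns the component object itself, so add_label is available on it.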