
Disabling part of the nlp pipeline

I am running spaCy v2.x on a Windows box with Python 3. I do not have admin privileges, so I have to load the pipeline as:

nlp = en_core_web_sm.load()

When I run the same script on a *nix box, I can load the pipeline as:

nlp = spacy.load('en', disable = ['ner', 'tagger', 'parser', 'textcat'])

All I am doing is tokenizing, so I do not need the entire pipeline. On the Windows box, if I load the pipeline like:

nlp = en_core_web_sm.load(disable = ['ner', 'tagger', 'parser', 'textcat'])

Does that actually disable the components?

spaCy information on the nlp pipeline

Britt asked Dec 20 '18 14:12




1 Answer

You can check the current pipeline components with:

print(nlp.pipe_names)

If the output does not convince you, you can check manually by using a component you disabled and printing its output. For example, disable the parser and try printing the dependency tags.
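The check above can be wrapped in a small helper. This is only a sketch: `still_active` and `load_tokenizer_only` are hypothetical names introduced here, and it assumes spaCy v2.x with the `en_core_web_sm` package installed, where the package-level `load()` passes `disable` through in the same way as `spacy.load()`.

```python
# Component names the question wants switched off.
UNWANTED = ["ner", "tagger", "parser", "textcat"]


def still_active(nlp, unwanted=UNWANTED):
    """Return the names from `unwanted` that still appear in nlp.pipe_names.

    Works with any object exposing a `pipe_names` list, such as a loaded
    spaCy Language object. An empty result means none of them will run.
    """
    return [name for name in unwanted if name in nlp.pipe_names]


def load_tokenizer_only():
    """Load the small English model with the unwanted components disabled.

    Hypothetical helper: the import is done lazily so that the check above
    can be used without the model package being installed.
    """
    import en_core_web_sm  # assumption: installed as a Python package

    return en_core_web_sm.load(disable=UNWANTED)
```

To use it, call `nlp = load_tokenizer_only()` and then `print(still_active(nlp))`. If the list that comes back is empty, none of the four components are in the active pipeline, and calling `nlp(text)` should only tokenize.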

0x5050 answered Sep 20 '22 21:09