I am getting an error when trying to use the spaCy Matcher:
~\Anaconda3\lib\site-packages\spacy\matcher\matcher.pyx in spacy.matcher.matcher.Matcher.add()
TypeError: add() takes exactly 2 positional arguments (3 given)
Is there an alternative to spacy.matcher.matcher.Matcher.add()?
The Matcher lets you find words and phrases using rules describing their token attributes. Rules can refer to token annotations (like the text or part-of-speech tags), as well as lexical attributes like Token.is_punct. Applying the matcher to a Doc gives you access to the matched tokens in context.
spaCy features a rule-matching engine, the Matcher, that operates over tokens, similar to regular expressions. The rules can refer to token annotations (e.g. the token text or tag_) and flags such as IS_PUNCT.
Unlike regular expressions, which match fixed character patterns, the Matcher matches tokens, phrases, and entities against predefined patterns that can use features such as part-of-speech tags, entity types, dependency labels, and lemmas.
See the spaCy Matcher.add() documentation:
Changed in v3.0
As of spaCy v3.0, Matcher.add takes a list of patterns as the second argument (instead of a variable number of arguments). The on_match callback becomes an optional keyword argument.
patterns = [[{"TEXT": "Google"}, {"TEXT": "Now"}], [{"TEXT": "GoogleNow"}]]
- matcher.add("GoogleNow", on_match, *patterns)
+ matcher.add("GoogleNow", patterns, on_match=on_match)
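To illustrate the keyword-argument form, here is a minimal sketch of a callback passed as on_match. It assumes spaCy v3 and uses a blank English pipeline (spacy.blank("en")) so that no trained model is needed; the names collect_match and seen and the sample sentence are made up for illustration.

```python
import spacy
from spacy.matcher import Matcher

# A blank pipeline is enough here: TEXT patterns only need the tokenizer.
nlp = spacy.blank("en")
matcher = Matcher(nlp.vocab)

seen = []  # hypothetical container for the matched texts

def collect_match(matcher, doc, i, matches):
    # spaCy calls this once per match; matches[i] is the current match.
    match_id, start, end = matches[i]
    seen.append(doc[start:end].text)

patterns = [[{"TEXT": "Google"}, {"TEXT": "Now"}], [{"TEXT": "GoogleNow"}]]
# v3 style: list of patterns second, callback as a keyword argument.
matcher.add("GoogleNow", patterns, on_match=collect_match)

doc = nlp("I use Google Now and GoogleNow daily")
matches = matcher(doc)
print(seen)
```

Both spellings are registered under the single key "GoogleNow", so the callback should fire for either one.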
Example usage:
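import spacy
from spacy.matcher import Matcher
# Assumes the en_core_web_sm model is installed
nlp = spacy.load("en_core_web_sm")
matcher = Matcher(nlp.vocab)
pattern = [{"LOWER": "hello"}, {"LOWER": "world"}]
# The second argument is a list of patterns, even for a single pattern
matcher.add("HelloWorld", [pattern])
doc = nlp("hello world!")
matches = matcher(doc)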
Instead of matcher.add('Relation_name', None, pattern), use matcher.add('Relation_name', [pattern], on_match=None). Because on_match is optional, matcher.add('Relation_name', [pattern]) is equivalent.
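If the same code has to run on both spaCy v2 and v3, one option is a small wrapper that picks the call style from the installed version. This is only a sketch: add_patterns is a hypothetical helper, not part of spaCy, and it assumes the major version can be read from the first component of spacy.__version__. A blank English pipeline is used so no model download is needed.

```python
import spacy
from spacy.matcher import Matcher

def add_patterns(matcher, key, patterns, on_match=None):
    """Hypothetical helper: add a list of patterns on spaCy v2 or v3."""
    major = int(spacy.__version__.split(".")[0])
    if major >= 3:
        # v3: patterns as one list, callback as a keyword argument
        matcher.add(key, patterns, on_match=on_match)
    else:
        # v2: callback second, patterns as separate positional arguments
        matcher.add(key, on_match, *patterns)

nlp = spacy.blank("en")
matcher = Matcher(nlp.vocab)
add_patterns(matcher, "HelloWorld", [[{"LOWER": "hello"}, {"LOWER": "world"}]])

doc = nlp("hello world!")
matches = matcher(doc)
for match_id, start, end in matches:
    # The match ID is a hash; the vocab's string store maps it back to the key.
    print(nlp.vocab.strings[match_id], doc[start:end].text)
```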
In addition, if you need to match several patterns under one name, pass them all in the same list, as in the example below.
import spacy
from spacy.matcher import Matcher

nlp = spacy.load('en_core_web_sm')
matcher = Matcher(nlp.vocab)

# Three spellings of the same term: one token, hyphenated, and two tokens
pattern1 = [{'LOWER': 'solarpower'}]
pattern2 = [{'LOWER': 'solar'}, {'IS_PUNCT': True}, {'LOWER': 'power'}]
pattern3 = [{'LOWER': 'solar'}, {'LOWER': 'power'}]
matcher.add('SolarPower', [pattern1, pattern2, pattern3])

doc = nlp("The Solar Power industry continues to grow as solarpower increases. Solar-power is good")
found_matches = matcher(doc)
for _, start, end in found_matches:
    span = doc[start:end]
    print(span)

Output:
Solar Power
solarpower
Solar-power