
Problem with using spacy.matcher.matcher.Matcher.add() method

Tags:

matcher

spacy

I get the following error when calling add() on a spaCy Matcher:

~\Anaconda3\lib\site-packages\spacy\matcher\matcher.pyx in spacy.matcher.matcher.Matcher.add()
TypeError: add() takes exactly 2 positional arguments (3 given)

Is there an alternative to spacy.matcher.matcher.Matcher.add(), or a different way to call it?

Vignesh c s asked Feb 11 '21 22:02



People also ask

How spaCy Matcher works?

The Matcher lets you find words and phrases using rules describing their token attributes. Rules can refer to token annotations (like the text or part-of-speech tags), as well as lexical attributes like Token.is_punct. Applying the matcher to a Doc gives you access to the matched tokens in context.

What is rule based matching in spaCy?

spaCy features a rule-matching engine, the Matcher, that operates over tokens, similar to regular expressions. The rules can refer to token annotations (e.g. the token text or tag_, and flags like IS_PUNCT).

What is rule-based grammar matching?

Unlike a regular expression, which matches fixed character patterns, rule-based matching finds tokens, phrases, and entities using preset patterns over linguistic features such as part-of-speech tags, entity types, dependency labels, and lemmas.


3 Answers

See the spaCy Matcher.add() documentation:

Changed in v3.0
As of spaCy v3.0, Matcher.add takes a list of patterns as the second argument (instead of a variable number of arguments). The on_match callback becomes an optional keyword argument.

patterns = [[{"TEXT": "Google"}, {"TEXT": "Now"}], [{"TEXT": "GoogleNow"}]]
- matcher.add("GoogleNow", on_match, *patterns)
+ matcher.add("GoogleNow", patterns, on_match=on_match)

Example usage:

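import spacy
from spacy.matcher import Matcher

# Load a pipeline so that nlp is defined (requires the en_core_web_sm model)
nlp = spacy.load("en_core_web_sm")
matcher = Matcher(nlp.vocab)
pattern = [{"LOWER": "hello"}, {"LOWER": "world"}]
# v3 signature: the second argument is a list of patterns
matcher.add("HelloWorld", [pattern])
doc = nlp("hello world!")
matches = matcher(doc)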
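
If you also need the on_match callback, here is a minimal sketch of the v3 keyword form. It assumes spaCy v3 is installed and uses spacy.blank("en") (tokenizer only, no model download) as a stand-in pipeline; the matched_texts list is a hypothetical collector added for illustration.

```python
import spacy
from spacy.matcher import Matcher

# Assumption: a blank English pipeline is enough here, since TEXT
# patterns only depend on the tokenizer.
nlp = spacy.blank("en")
matcher = Matcher(nlp.vocab)

matched_texts = []  # hypothetical collector, not part of spaCy


def on_match(matcher, doc, i, matches):
    # spaCy calls this once per match; i is the index of that match
    # in the full list of (match_id, start, end) tuples.
    match_id, start, end = matches[i]
    matched_texts.append(doc[start:end].text)


patterns = [[{"TEXT": "Google"}, {"TEXT": "Now"}], [{"TEXT": "GoogleNow"}]]
# v3: list of patterns second, callback as a keyword argument
matcher.add("GoogleNow", patterns, on_match=on_match)

doc = nlp("I use Google Now, also written GoogleNow.")
matches = matcher(doc)
print(matched_texts)
```

The callback fires while matcher(doc) runs, so matched_texts is filled by the time the call returns.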
Wiktor Stribiżew answered Oct 21 '22 01:10



Instead of matcher.add('Relation_name', None, pattern), which passes the callback as the second positional argument,

use the v3 form: matcher.add('Relation_name', [pattern], on_match=None)
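
If the same script has to run on both old and new installations, one possible approach is to branch on the installed major version. This is only a sketch: the version check is my own assumption about how you might handle it, not something the spaCy docs prescribe, and spacy.blank("en") is used so that no model download is needed.

```python
import spacy
from spacy.matcher import Matcher

nlp = spacy.blank("en")  # assumption: tokenizer-only pipeline is sufficient
matcher = Matcher(nlp.vocab)
pattern = [{"LOWER": "hello"}, {"LOWER": "world"}]

# Major version of the installed spaCy, e.g. 3 for "3.4.1"
major = int(spacy.__version__.split(".")[0])

if major >= 3:
    # v3+: patterns are passed as a list, callback is an optional keyword
    matcher.add("Relation_name", [pattern])
else:
    # v2: callback (or None) comes second, patterns follow as varargs
    matcher.add("Relation_name", None, pattern)

found = matcher(nlp("Hello world"))
print(len(found))
```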

mpriya answered Oct 20 '22 23:10



In addition, if you have several patterns to match under one name, pass them all in a single list. For example:

import spacy
from spacy.matcher import Matcher

nlp = spacy.load('en_core_web_sm')
matcher = Matcher(nlp.vocab)

# Three spellings of the same term: "solarpower", "solar-power", "solar power"
pattern1 = [{'LOWER': 'solarpower'}]
pattern2 = [{'LOWER': 'solar'}, {'IS_PUNCT': True}, {'LOWER': 'power'}]
pattern3 = [{'LOWER': 'solar'}, {'LOWER': 'power'}]

# All patterns go in one list under a single match ID
matcher.add('SolarPower', [pattern1, pattern2, pattern3])
doc = nlp("The Solar Power industry continues to grow as solarpower increases. Solar-power is good")
found_matches = matcher(doc)

# Each match is a (match_id, start, end) tuple; start and end are token offsets
for _, start, end in found_matches:
    span = doc[start:end]
    print(span)

Output would be:

Solar Power 
solarpower 
Solar-power
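
The loop above throws away the match ID. If you register more than one rule, you can turn that ID back into the rule name through the vocab's string store. A small sketch, assuming spaCy v3 and a blank English pipeline; the "WindPower" rule and the example sentence are made up for illustration.

```python
import spacy
from spacy.matcher import Matcher

nlp = spacy.blank("en")  # assumption: LOWER patterns only need the tokenizer
matcher = Matcher(nlp.vocab)

# Two separately named rules; "WindPower" is a hypothetical second rule
matcher.add("SolarPower", [[{"LOWER": "solarpower"}],
                           [{"LOWER": "solar"}, {"LOWER": "power"}]])
matcher.add("WindPower", [[{"LOWER": "wind"}, {"LOWER": "power"}]])

doc = nlp("Solar power and wind power are growing.")

# match_id is a hash; nlp.vocab.strings maps it back to the rule name
labelled = [
    (nlp.vocab.strings[match_id], doc[start:end].text)
    for match_id, start, end in matcher(doc)
]
print(labelled)
```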

Thilee answered Oct 21 '22 01:10
