Is there a way to apply "OR" logic in a pattern matcher, similar to what can be done with regex? If possible, I don't want to have to create an individual pattern for each term (car, boat, bus, for example). I am also thinking that if I can do that, I can use a script to generate my rules. Any help would be appreciated.
Can I do something like the below, but without regex? Obviously, the regex for bus in this case might pick up other tokens that merely contain "bus".
import spacy
from spacy.matcher import Matcher
nlp = spacy.load("en_core_web_sm")
matcher = Matcher(nlp.vocab)
matcher.add("VEHICLE", None,
[{"LOWER":{"REGEX":"car|boat|bus"}}]
)
text = "I saw a car pulling a boat today, which was really funny. I also saw a bus pulling a boat."
doc = nlp(text)
matches = matcher(doc)
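If you have a list of items you want to use as patterns, build one single-token pattern per item and pass each of them to matcher.add() as a separate pattern (in spaCy v2, every argument after the callback is treated as one pattern):
l = ['car', 'boat', 'bus']
patterns = [[{"LOWER": x}] for x in l]
matcher.add("VEHICLE", None, *patterns)
>>> for _, start, end in matcher(doc):
...     print(doc[start:end].text)
car
boat
bus
boat
The patterns will look like [[{'LOWER': 'car'}], [{'LOWER': 'boat'}], [{'LOWER': 'bus'}]]. Each inner list is one pattern, and the matcher fires when any of them matches, which gives the "OR" behaviour. A flat list such as [{'LOWER': 'car'}, {'LOWER': 'boat'}, {'LOWER': 'bus'}] would instead be read as a single three-token pattern that only matches the sequence "car boat bus".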
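Since the question mentions generating the rules with a script, here is a minimal sketch of what that could look like. The helper names build_or_patterns and build_in_pattern are hypothetical (they are not part of spaCy); the second one assumes a spaCy version that supports the IN set operator in token patterns (2.1 or later, as far as I know). The block only builds the pattern data, so it runs without spaCy installed; the matcher.add() calls are shown as comments because the signature differs between major versions.

```python
# Hypothetical helpers (not part of spaCy) that turn a plain word list
# into Matcher patterns.

def build_or_patterns(terms):
    """Return one single-token pattern per term.

    The matcher treats separate patterns under one key as alternatives.
    """
    return [[{"LOWER": term.lower()}] for term in terms]


def build_in_pattern(terms):
    """Return a single one-token pattern using the IN set operator.

    Assumption: the installed spaCy supports {"IN": [...]} (2.1 or later).
    """
    return [{"LOWER": {"IN": [term.lower() for term in terms]}}]


vehicles = ["car", "boat", "Bus"]  # mixed case is normalised for LOWER
or_patterns = build_or_patterns(vehicles)
in_pattern = build_in_pattern(vehicles)

print(or_patterns)
print(in_pattern)

# Passing the result to the matcher depends on the spaCy version:
#   spaCy v2: matcher.add("VEHICLE", None, *or_patterns)
#             matcher.add("VEHICLE", None, in_pattern)
#   spaCy v3: matcher.add("VEHICLE", or_patterns)
#             matcher.add("VEHICLE", [in_pattern])
```

Unlike the "car|boat|bus" regex, both forms compare the whole lowercased token, so "bus" should not match inside a longer token such as "business".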