
spaCy Pattern Matching - OR statements

Tags: python, spacy

Is there a way to apply "OR" logic in the pattern matcher, similar to alternation in a regex? I'd rather not create a separate pattern for each word (car, boat, bus, for example) if I can avoid it. If this is possible, I could also use a script to generate my rules. Any help would be appreciated.

Can I do something like the below, but without regex? Obviously, bus in this case might also pick up other tokens that merely contain it.

import spacy
from spacy.matcher import Matcher

nlp = spacy.load("en_core_web_sm")
matcher = Matcher(nlp.vocab)

matcher.add("VEHICLE", None,
            [{"LOWER":{"REGEX":"car|boat|bus"}}]
           )

text = "I saw a car pulling a boat today, which was really funny. I also saw a bus pulling a boat."

doc = nlp(text)
matches = matcher(doc)
scarpacci asked Apr 16 '26 04:04


1 Answer

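If you have a list of items to match, build one single-token pattern per item and pass them all to matcher.add() in place of the regex pattern. Each pattern is itself a list of token dictionaries, so every item needs its own one-element list. The matcher reports a match whenever any one of the patterns matches, which gives you the "OR" behaviour:

l = ['car', 'boat', 'bus']
# One single-token pattern per word. A flat list of dicts would instead be
# read as a single three-token sequence ("car boat bus").
patterns = [[{"LOWER": x}] for x in l]
matcher.add("VEHICLE", None, *patterns)  # spaCy v3: matcher.add("VEHICLE", patterns)
>>> for _, start, end in matcher(doc):
...     print(doc[start:end].text)
car
boat
bus
boat

The patterns will look like [[{'LOWER': 'car'}], [{'LOWER': 'boat'}], [{'LOWER': 'bus'}]].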
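
Since the question also mentions generating rules from a script, here is a minimal sketch of two other ways to express the same "OR" in a single token pattern. It assumes a spaCy version that supports the extended pattern syntax ("IN" and "REGEX" as value operators, which as far as I know arrived in v2.1). The helper names or_pattern and anchored_regex_pattern are made up for illustration and are not part of spaCy. The block only builds the pattern data, so it runs without spaCy installed.

```python
import re


def or_pattern(words):
    """Build one single-token pattern that matches any of `words`.

    "IN" is spaCy's set-membership operator; "LOWER" compares against the
    lowercased token text, so the words are lowercased here as well.
    """
    return [{"LOWER": {"IN": sorted({w.lower() for w in words})}}]


def anchored_regex_pattern(words):
    """Regex variant (hypothetical helper): anchor the alternation so that
    "bus" matches only the whole token and not, say, "business"."""
    alternation = "|".join(re.escape(w) for w in sorted({w.lower() for w in words}))
    return [{"LOWER": {"REGEX": f"^(?:{alternation})$"}}]


vehicle_pattern = or_pattern(["car", "boat", "bus"])
print(vehicle_pattern)
print(anchored_regex_pattern(["car", "boat", "bus"]))
```

Either result is a single pattern, so it would be registered with matcher.add("VEHICLE", None, vehicle_pattern) using the call style from the question, or matcher.add("VEHICLE", [vehicle_pattern]) in spaCy v3. The "IN" form is usually the simpler choice for a plain word list; the anchored regex is only worth it when the alternatives are real patterns rather than fixed words.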

Wiktor Stribiżew answered Apr 18 '26 18:04