Sentiment words behave very differently when under the semantic scope of negation. I want to use a slightly modified version of Das and Chen (2001) They detect words such as no, not, and never and then append a "neg"-suffix to every word appearing between a negation and a clause-level punctuation mark. I want to create something similar with dependency parsing from spaCy.
import spacy
from spacy import displacy
nlp = spacy.load('en')
doc = nlp(u'$AAPL is óóóóópen to ‘Talk’ about patents with GOOG definitely not the treatment #samsung got:-) heh')
options = {'compact': True, 'color': 'black', 'font': 'Arial'}
displacy.serve(doc, style='dep', options=options)
Visualized dependency paths:
Nicely, there exists a negation modifier in the dependency tag scheme; NEG
In order to identify negation I use the following:
negation = [tok for tok in doc if tok.dep_ == 'neg']
Now I want to retrieve the scope of the negations.
import spacy
from spacy import displacy
import pandas as pd
nlp = spacy.load("en_core_web_sm")
doc = nlp(u'AAPL is óóóóópen to Talk about patents with GOOG definitely not the treatment got')
print('DEPENDENCY RELATIONS')
print('Key: ')
print('TEXT, DEP, HEAD_TEXT, HEAD_POS, CHILDREN')
for token in doc:
print(token.text, token.dep_, token.head.text, token.head.pos_,
[child for child in token.children])
This gives the following output:
DEPENDENCY RELATIONS
Key:
TEXT, DEP, HEAD_TEXT, HEAD_POS, CHILDREN
AAPL nsubj is VERB []
is ROOT is VERB [AAPL, óóóóópen, got]
óóóóópen acomp is VERB [to]
to prep óóóóópen ADJ [Talk]
Talk pobj to ADP [about, definitely]
about prep Talk NOUN [patents]
patents pobj about ADP [with]
with prep patents NOUN [GOOG]
GOOG pobj with ADP []
definitely advmod Talk NOUN []
not neg got VERB []
the det treatment NOUN []
treatment nsubj got VERB [the]
got conj is VERB [not, treatment]
How to filter out only the token.head.text of not, so got
and it's locating?
Can someone help me out?
Dependency Parsing Using spaCyIt defines the dependency relationship between headwords and their dependents. The head of a sentence has no dependency and is called the root of the sentence. The verb is usually the head of the sentence. All other words are linked to the headword.
spaCy features a fast and accurate syntactic dependency parser, and has a rich API for navigating the tree. The parser also powers the sentence boundary detection, and lets you iterate over base noun phrases, or “chunks”. You can check whether a Doc object has been parsed by calling doc.
[43] In the scheme used by spaCy, prepositions are referred to as “adposition” and use a tag ADP. Words like “Friday” or “Obama” are tagged with PROPN, which stands for “proper nouns” reserved for names of known individuals, places, time references, organizations, events and such.
The term Dependency Parsing (DP) refers to the process of examining the dependencies between the phrases of a sentence in order to determine its grammatical structure. A sentence is divided into many sections based mostly on this.
You can simply define and loop through the head tokens of the negation tokens you found:
negation_tokens = [tok for tok in doc if tok.dep_ == 'neg']
negation_head_tokens = [token.head for token in negation_tokens]
for token in negation_head_tokens:
print(token.text, token.dep_, token.head.text, token.head.pos_, [child for child in token.children])
which prints you the information for got
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With