I'm looking to find out whether a sentence contains an imperative (e.g. categorize "click below" as an imperative, but "here is some information" as not).
Is this possible with e.g. the Stanford Parser? For reference, the main site (http://nlp.stanford.edu/software/lex-parser.shtml) mentions 'Improved recognition of imperatives', but the dependencies manual (http://nlp.stanford.edu/software/dependencies_manual.pdf) does not list a field for them.
Alternatively, is there another approach which would work?
An imperative sentence is a sentence that expresses a direct command, request, invitation, warning, or instruction. Imperative sentences usually have no explicit subject; instead, a directive is given to an implied second person. For example, the sentence “Wash the dinner plates” commands the implied subject to wash the dishes.
The term imperative is used in a number of ways in the linguistics literature. In one use, imperative is a semantic modality. Imperatives are directives conveying an illocutionary force of commanding, prohibiting, suggesting, permitting, or requesting by the speaker.
The imperative mood in English is generally used to give an order, to prompt someone to do something, to give a warning or to give instructions. There are several distinguishable forms of the imperative in English: affirmative, negative, and exhortative, as well as the more cordial ways of expressing an order.
An imperative sentence is one which is used to express a command/order or request and also to give an instruction or some advice. Imperative sentences do not require a subject.
I also failed to find any library or literature that (directly) addresses 'imperative detection' (there must be a different official name for it...). Here's what I've come up with by reading up on the grammar of imperatives, learning about chunking and some experimentation.
(Python + NLTK)
from nltk import RegexpParser
from nltk.tree import Tree


def is_imperative(tagged_sent):
    # tagged_sent is a list of (word, POS-tag) tuples for one sentence
    # if the sentence is not a question...
    if tagged_sent[-1][0] != "?":
        # catches simple imperatives, e.g. "Open the pod bay doors, HAL!"
        if tagged_sent[0][1] == "VB" or tagged_sent[0][1] == "MD":
            return True
        # catches imperative sentences starting with words like 'please', 'you',...
        # E.g. "Dave, stop.", "Just take a stress pill and think things over."
        else:
            chunk = get_chunks(tagged_sent)
            # check if the first chunk of the sentence is a VB-Phrase
            if type(chunk[0]) is Tree and chunk[0].label() == "VB-Phrase":
                return True
    # Questions can be imperatives too, let's check if this one is
    else:
        # check if sentence contains the word 'please'
        pls = len([w for w in tagged_sent if w[0].lower() == "please"]) > 0
        # catches requests disguised as questions
        # e.g. "Open the doors, HAL, please?"
        if pls and (tagged_sent[0][1] == "VB" or tagged_sent[0][1] == "MD"):
            return True
        chunk = get_chunks(tagged_sent)
        # catches imperatives ending with a Question tag
        # and starting with a verb in base form, e.g. "Stop it, will you?"
        if type(chunk[-1]) is Tree and chunk[-1].label() == "Q-Tag":
            if (chunk[0][1] == "VB" or
                    (type(chunk[0]) is Tree and chunk[0].label() == "VB-Phrase")):
                return True
    return False


# chunks the sentence into grammatical phrases based on its POS-tags
def get_chunks(tagged_sent):
    chunkgram = r"""VB-Phrase: {<DT><,>*<VB>}
                    VB-Phrase: {<RB><VB>}
                    VB-Phrase: {<UH><,>*<VB>}
                    VB-Phrase: {<UH><,><VBP>}
                    VB-Phrase: {<PRP><VB>}
                    VB-Phrase: {<NN.?>+<,>*<VB>}
                    Q-Tag: {<,><MD><RB>*<PRP><.>*}"""
    chunkparser = RegexpParser(chunkgram)
    return chunkparser.parse(tagged_sent)
Haven't tested the performance of the algorithm yet, though from my observations I'd say precision is probably better than recall. Note that the performance greatly depends on the correctness of the POS-tags.
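To put numbers on the precision/recall guess above, one option is a small evaluation harness like the sketch below. Everything in it is an assumption for illustration: the names `first_token_rule`, `precision_recall` and `LABELLED` are hypothetical, the classifier is a stand-in that only applies the simplest check from the code above (so the sketch runs without NLTK), and the four hand-tagged sentences are toy data, so the resulting figures say nothing about the real algorithm.

```python
from typing import Callable, List, Tuple

TaggedSent = List[Tuple[str, str]]  # [(word, POS-tag), ...]


def first_token_rule(tagged_sent: TaggedSent) -> bool:
    """Stand-in classifier (hypothetical): not a question and
    the first token is tagged VB or MD."""
    return tagged_sent[-1][0] != "?" and tagged_sent[0][1] in ("VB", "MD")


def precision_recall(
    classify: Callable[[TaggedSent], bool],
    labelled: List[Tuple[TaggedSent, bool]],
) -> Tuple[float, float]:
    """Compute (precision, recall) of `classify` against gold labels."""
    tp = fp = fn = 0
    for sent, gold in labelled:
        pred = classify(sent)
        if pred and gold:
            tp += 1
        elif pred and not gold:
            fp += 1
        elif gold and not pred:
            fn += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy, hand-tagged examples (illustrative only); True = imperative.
LABELLED = [
    ([("Open", "VB"), ("the", "DT"), ("doors", "NNS"), ("!", ".")], True),
    ([("Dave", "NNP"), (",", ","), ("stop", "VB"), (".", ".")], True),
    ([("Here", "RB"), ("is", "VBZ"), ("some", "DT"),
      ("information", "NN"), (".", ".")], False),
    ([("Can", "MD"), ("you", "PRP"), ("help", "VB"), ("?", ".")], False),
]

if __name__ == "__main__":
    p, r = precision_recall(first_token_rule, LABELLED)
    print(f"precision={p:.2f} recall={r:.2f}")
```

To evaluate the actual algorithm, pass `is_imperative` instead of `first_token_rule` and replace the toy list with a larger set of sentences tagged by a real tagger (e.g. `nltk.pos_tag`) and labelled by hand.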