Identify Location Within the Sentence where the Missing Word Belongs

Question

I have the code below:

import nltk
exampleArray = ['The dog barking.']

def processLanguage():
    for item in exampleArray:
        tokenized = nltk.word_tokenize(item)
        tagged = nltk.pos_tag(tokenized)
        print(tagged)

processLanguage()

The output of the code above is the list of tokenized words with their corresponding parts of speech. Example:

[('The', 'DT'), ('dog', 'NN'), ('barking', 'NN'), ('.', '.')]

DT = determiner
NN = noun

The text is supposed to be

The dog is barking

and is supposed to have the POS sequence

DT -> NN -> VBZ -> VBG

VBZ = verb, present tense, 3rd person singular
VBG = verb, present participle or gerund

How can I make the program locate the position of the missing word within the sentence?

CLpragmatics · Accepted Answer

This is straightforward grammar checking. You need at least a tagger, a tool that annotates each token with its part of speech (POS), and a parser, ideally something like an Earley parser (https://en.wikipedia.org/wiki/Earley_parser) or another parser that can analyse the tree structure given a phrase structure grammar (PSG) of your target language. Regardless of which specific algorithm you choose, always keep in mind that natural language is at least mildly context-sensitive in the Chomsky hierarchy, so forget about finite-state automata etc.

If the parser does not validate your sentence as grammatical (in linguistic terms, it is not licensed by your PSG), you can use the tree structure to locate the position that is not filled, or is incorrectly filled, by some terminal symbol.

In addition, you need morphological analysis and case marking, which let you check for agreement errors between verbs and their arguments etc., in order to rule out sentences like "the dog are barking". Maybe also have a look at LFG or HPSG implementations, which handle this more thoroughly, since they are computationally more powerful (context-sensitive formalisms, in other words equivalent to a linear bounded Turing machine).
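As a rough illustration of the parser-based idea, here is a minimal sketch in plain Python. Everything in it is an assumption made for the example and not part of the original answer: the toy grammar GRAMMAR (written over POS tags instead of words), the helper names recognizes and find_missing, and the hard-coded tag sequence DT NN VBG. It uses a small hand-written Earley-style recognizer, and instead of inspecting the parse tree it takes a simpler brute-force route: it tries inserting each known tag at each position and keeps the insertions that make the sequence grammatical. Note that in the question's output the tagger labelled "barking" as NN, so a real pipeline would also have to cope with tagging errors on ungrammatical input.

```python
# Toy phrase structure grammar over POS tags (an assumption for this sketch).
# Symbols that never appear as a key (DT, NN, VBZ, VBG) are terminals.
GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("DT", "NN")],
    "VP": [("VBZ", "VBG"), ("VBZ",)],
}


def recognizes(tags, grammar=GRAMMAR, start="S"):
    """Earley-style recognizer: True if the tag sequence is licensed by the grammar.

    Chart items are (lhs, rhs, dot, origin). Empty productions are not handled.
    """
    chart = [[] for _ in range(len(tags) + 1)]

    def add(i, item):
        if item not in chart[i]:
            chart[i].append(item)

    for rhs in grammar[start]:
        add(0, (start, rhs, 0, 0))

    for i in range(len(tags) + 1):
        for lhs, rhs, dot, origin in chart[i]:  # the list may grow while we iterate
            if dot < len(rhs):
                symbol = rhs[dot]
                if symbol in grammar:
                    # Predict: expand the non-terminal after the dot.
                    for production in grammar[symbol]:
                        add(i, (symbol, production, 0, i))
                elif i < len(tags) and tags[i] == symbol:
                    # Scan: the next input tag matches the expected terminal.
                    add(i + 1, (lhs, rhs, dot + 1, origin))
            else:
                # Complete: advance every item that was waiting for this lhs.
                for p_lhs, p_rhs, p_dot, p_origin in chart[origin]:
                    if p_dot < len(p_rhs) and p_rhs[p_dot] == lhs:
                        add(i, (p_lhs, p_rhs, p_dot + 1, p_origin))

    return any(
        lhs == start and dot == len(rhs) and origin == 0
        for lhs, rhs, dot, origin in chart[len(tags)]
    )


def find_missing(tags, grammar=GRAMMAR):
    """Return (position, tag) pairs whose insertion makes the sequence grammatical."""
    terminals = sorted(
        {sym for prods in grammar.values() for rhs in prods for sym in rhs}
        - set(grammar)
    )
    repairs = []
    for position in range(len(tags) + 1):
        for tag in terminals:
            candidate = tags[:position] + [tag] + tags[position:]
            if recognizes(candidate, grammar):
                repairs.append((position, tag))
    return repairs


# "The dog barking" with the intended tags, hard-coded for the example.
print(find_missing(["DT", "NN", "VBG"]))
```

With this toy grammar, the only repair found is a VBZ at index 2, i.e. between "dog" and "barking". A realistic grammar would usually produce several candidate repairs, so some ranking step would still be needed, and agreement checks such as "the dog are barking" are outside what this sketch covers.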

Tags:

python

nlp

nltk

part-of-speech

pos-tagger

alyssaeliyah
