I am trying to find the correct part of speech for each word in a paragraph, using the Stanford POS Tagger. However, I am stuck at one point.
I want to identify the prepositions in the paragraph.
The Penn Treebank tagset says:
IN Preposition or subordinating conjunction
How can I be sure whether the current word is a preposition or a subordinating conjunction? How can I extract only the prepositions from the paragraph in this case?
You can't be sure. The reason for this somewhat strange PoS tag is that it is really hard to determine automatically whether, for example, for is a preposition or a subordinating conjunction. So, in order for automatic taggers to achieve better precision, this distinction is simply ignored. Note that there is also a tag TO, which is given to any occurrence of to, regardless of whether it functions as a preposition, an infinitive particle or whatever (I think there are others).
If you need to identify prepositions properly, you need to retrain a tagger with a modified tag set, or maybe train a classifier which takes PoS-tagged text and only does this final disambiguation.
I have made some progress in working out whether a word tagged IN is actually a preposition or a subordinating conjunction.
I parsed the following sentence:
She left early because Mike arrived with his new girlfriend.
(here because is a subordinating conjunction)
After POS tagging:
She_PRP left_VBD early_RB because_IN Mike_NNP arrived_VBD with_IN his_PRP$ new_JJ girlfriend_NN ._.
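As a first step, the words that need disambiguation can be pulled out of the tagged output. This is only a minimal sketch, assuming the tagger output is a whitespace-separated word_TAG string like the one above; in_candidates is a hypothetical helper name, not part of the Stanford tagger.

```python
def in_candidates(tagged: str) -> list[str]:
    """Return the words tagged IN in a whitespace-separated word_TAG string."""
    words = []
    for token in tagged.split():
        # Split on the LAST underscore so a word that itself contains "_" survives.
        word, _, tag = token.rpartition("_")
        if tag == "IN":
            words.append(word)
    return words


tagged = ("She_PRP left_VBD early_RB because_IN Mike_NNP arrived_VBD "
          "with_IN his_PRP$ new_JJ girlfriend_NN ._.")
print(in_candidates(tagged))  # ['because', 'with']
```

Both words come back as IN, which is exactly why the tag alone is not enough.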
Here, to find out whether because is a preposition or not, I parsed the sentence.
In the parse tree, the direct parent of the IN node for because is SBAR (subordinate clause), so it is a subordinating conjunction.
with is also tagged IN, but its direct parent is PP, so it is a preposition.
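The parent check can be sketched in a few lines. This assumes the parser emits a Penn Treebank style bracketed tree; the tree below is written by hand to illustrate the typical shape and is not actual parser output, and classify_in is a hypothetical helper name.

```python
import re

# Hand-written bracketed tree for the example sentence (an assumption,
# not output captured from the Stanford parser).
TREE = """
(ROOT
  (S
    (NP (PRP She))
    (VP (VBD left)
      (ADVP (RB early))
      (SBAR (IN because)
        (S
          (NP (NNP Mike))
          (VP (VBD arrived)
            (PP (IN with)
              (NP (PRP$ his) (JJ new) (NN girlfriend)))))))
    (. .)))
"""

ROLES = {"PP": "preposition", "SBAR": "subordinating conjunction"}


def classify_in(tree: str) -> list[tuple[str, str, str]]:
    """Return (word, parent label, role) for every leaf tagged IN."""
    tokens = re.findall(r"\(|\)|[^\s()]+", tree)
    stack = []            # labels of the currently open constituents
    expect_label = False  # True right after "(": the next token is a label
    found = []
    for tok in tokens:
        if tok == "(":
            expect_label = True
        elif tok == ")":
            stack.pop()
        elif expect_label:
            stack.append(tok)
            expect_label = False
        elif len(stack) >= 2 and stack[-1] == "IN":
            # tok is a word; stack[-1] is its tag, stack[-2] the tag's parent
            parent = stack[-2]
            found.append((tok, parent, ROLES.get(parent, "unknown")))
    return found


for word, parent, role in classify_in(TREE):
    print(word, parent, role)
```

Any IN whose parent is neither PP nor SBAR is reported as unknown rather than guessed.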
Example 2 :
Keep your hand on the wound until the nurse asks you to take it off. (here until is a subordinating conjunction)
The POS tagging is:
Keep_VB your_PRP$ hand_NN on_IN the_DT wound_NN until_IN the_DT nurse_NN asks_VBZ you_PRP to_TO take_VB it_PRP off_RP ._.
So until and on are both marked as IN.
However, the picture gets clearer when we actually parse the sentence.
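For the second example, a quick shortcut is to search the bracketed parse directly. As before, the tree is hand-written in Penn Treebank style as an assumption about what a parser would return, and in_roles is a hypothetical name. The pattern only catches an IN that is the first child of its PP or SBAR, which is the usual position but not guaranteed.

```python
import re

# Hand-written bracketed tree for the second sentence (an assumption,
# not output captured from the Stanford parser).
TREE_2 = """
(ROOT
  (S
    (VP (VB Keep)
      (NP (PRP$ your) (NN hand))
      (PP (IN on)
        (NP (DT the) (NN wound)))
      (SBAR (IN until)
        (S
          (NP (DT the) (NN nurse))
          (VP (VBZ asks)
            (NP (PRP you))
            (S
              (VP (TO to)
                (VP (VB take)
                  (NP (PRP it))
                  (PRT (RP off)))))))))
    (. .)))
"""

# Matches "(PP (IN word)" or "(SBAR (IN word)".
IN_UNDER_PARENT = re.compile(r"\((PP|SBAR)\s+\(IN\s+([^\s()]+)\)")


def in_roles(tree: str) -> list[tuple[str, str]]:
    """Return (word, parent label) for each IN that opens a PP or an SBAR."""
    return [(word, parent) for parent, word in IN_UNDER_PARENT.findall(tree)]


print(in_roles(TREE_2))
```

Here on sits under PP and until under SBAR, matching the reading given for the sentence.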
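So in the end I conclude that because is a subordinating conjunction and with is a preposition; by the same parent check, until should come out as a subordinating conjunction and on as a preposition.
I tried this on many variations of sentences, and it worked for almost all of them, except for some cases with before and after.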