
Semantic Role Labeling using NLTK

I have a list of sentences and I want to analyze every sentence and identify the semantic roles within that sentence. How do I do that?

I came across the PropbankCorpusReader in the NLTK module, which adds semantic labeling information to the Penn Treebank. My research on the internet also suggests that this module is used to perform Semantic Role Labeling (SRL).

However, I am unable to find a short HOWTO that shows how to use the PropbankCorpusReader to perform SRL on arbitrary text.

Can someone point to examples of using PropbankCorpusReader to perform SRL on arbitrary sentences?

Prahalad Deshpande asked Dec 14 '13 19:12



4 Answers

SRL is not at all a trivial problem, and not really something that can be done out of the box using NLTK.

You can break down the task of SRL into 3 separate steps:

  1. Identifying the predicate.
  2. Performing word sense disambiguation on the predicate to determine which semantic arguments it accepts.
  3. Identifying the semantic arguments in the sentence.

Most current approaches to this problem use supervised machine learning: a classifier is trained on a subset of PropBank or FrameNet sentences and then tested on the remaining sentences to measure its accuracy. Researchers tend to focus on tweaking features and algorithms, as well as on whether the above steps are done sequentially or simultaneously, and in what order.

Some papers you might want to check out are:

  • Simmons (1973) - the classic SRL paper.
  • Gildea and Jurafsky (2002) - provides a simple set of features for use in classification.
  • Xue and Palmer (2004) - a more in-depth look at useful features.
  • Meza-Ruiz and Riedel (2009) - an interesting approach using Markov Logic.
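
To make the idea of classification features concrete, here is a minimal sketch of three of the features described by Gildea and Jurafsky (2002): phrase type, position relative to the predicate, and the parse-tree path. The nested-tuple tree, the toy sentence, and the names `node_at`, `tree_path`, and `extract_features` are assumptions made for this illustration and are not part of NLTK. Voice and head word are left out to keep it short.

```python
# Toy constituency tree as nested tuples: (label, child, ...).
# A leaf is (pos_tag, word). Illustrative data only, not taken from PropBank.
TREE = ("S",
        ("NP", ("PRP", "He")),
        ("VP", ("VBD", "ate"),
               ("NP", ("DT", "the"), ("NN", "apple"))))


def node_at(tree, position):
    """Return the subtree addressed by a tuple of child indices."""
    for index in position:
        tree = tree[index + 1]  # children start at tuple index 1
    return tree


def words(tree):
    """Return the words covered by a subtree, left to right."""
    if isinstance(tree[1], str):
        return [tree[1]]
    return [w for child in tree[1:] for w in words(child)]


def tree_path(tree, constituent, predicate):
    """Labels going up from the constituent to the lowest common
    ancestor, then down to the predicate."""
    shared = 0
    while (shared < min(len(constituent), len(predicate))
           and constituent[shared] == predicate[shared]):
        shared += 1
    up = [node_at(tree, constituent[:i])[0]
          for i in range(len(constituent), shared - 1, -1)]
    down = [node_at(tree, predicate[:i])[0]
            for i in range(shared + 1, len(predicate) + 1)]
    return "↑".join(up) + "".join("↓" + label for label in down)


def extract_features(tree, constituent, predicate):
    """Build a feature dict for one candidate argument constituent."""
    node = node_at(tree, constituent)
    return {
        "predicate": words(node_at(tree, predicate))[0],
        "phrase_type": node[0],
        # Tuple comparison matches left-to-right order as long as
        # neither node dominates the other.
        "position": "before" if constituent < predicate else "after",
        "path": tree_path(tree, constituent, predicate),
    }


if __name__ == "__main__":
    predicate_position = (1, 0)  # the VBD node "ate"
    for candidate in [(0,), (1, 1)]:  # subject NP, object NP
        print(extract_features(TREE, candidate, predicate_position))
```

Feature dicts like these would then be paired with PropBank role labels and passed to whichever classifier you prefer.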

The Markov Logic approach is promising, but in my own experience it runs into severe scalability issues (I've only ever used Alchemy, though Alchemy Lite looks interesting). It's not a huge amount of work to implement some kind of classifier using the NLTK PropBank data, and some off-the-shelf classifiers already exist in Python.

EDIT: This assignment from the University of Edinburgh gives some examples of how to parse Propbank data, and part of a school project I did implements a complete Propbank feature parser, though the features are geared specifically towards use in Markov Logic Networks in the style of Meza-Ruiz and Riedel (2009).
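
As a rough idea of what parsing PropBank data involves, here is a small standard-library sketch that splits one annotation line into its fields. It assumes the whitespace-separated layout of the PropBank `prop.txt` file (file id, sentence number, predicate word number, tagger, roleset, inflection, then `location-LABEL` items). The sample line is only illustrative of that layout, and `PropInstance` and `parse_prop_line` are hypothetical names, not NLTK's API. NLTK's own reader exposes the same information through `nltk.corpus.propbank.instances()`.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Illustrative annotation line in the assumed prop.txt layout.
SAMPLE_LINE = ("wsj/00/wsj_0001.mrg 0 8 gold join.01 vp--a "
               "0:2-ARG0 7:0-ARGM-MOD 8:0-rel 9:1-ARG1 "
               "11:1-ARGM-PRD 15:1-ARGM-TMP")


@dataclass
class PropInstance:
    fileid: str
    sentnum: int
    wordnum: int
    tagger: str
    roleset: str
    inflection: str
    predicate: str                    # tree pointer of the predicate, e.g. "8:0"
    arguments: List[Tuple[str, str]]  # (tree pointer, role label) pairs


def parse_prop_line(line: str) -> PropInstance:
    """Split one annotation line into header fields and labeled arguments."""
    pieces = line.split()
    if len(pieces) < 7:
        raise ValueError(f"Badly formatted PropBank line: {line!r}")
    fileid, sentnum, wordnum, tagger, roleset, inflection = pieces[:6]

    predicate = None
    arguments = []
    for piece in pieces[6:]:
        # Pointers never contain "-", so split on the first one only;
        # labels such as ARGM-TMP keep their internal hyphen.
        location, label = piece.split("-", 1)
        if label == "rel":
            predicate = location
        else:
            arguments.append((location, label))
    if predicate is None:
        raise ValueError(f"No predicate ('rel') item in line: {line!r}")

    return PropInstance(fileid, int(sentnum), int(wordnum), tagger,
                        roleset, inflection, predicate, arguments)


if __name__ == "__main__":
    instance = parse_prop_line(SAMPLE_LINE)
    print(instance.roleset, instance.predicate)
    for location, label in instance.arguments:
        print(location, label)
```

Each pointer is kept as a raw `word:height` string here; resolving it against the Treebank parse is the step that yields the actual argument constituents.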

cjm answered Nov 20 '22 11:11



Check out this new Python library, https://pypi.python.org/pypi/nlpnet/ (it depends on NLTK). It does POS tagging and SRL.

Pykler answered Nov 20 '22 12:11



I'd suggest PractNLPTools, which has a number of decent tools, including Semantic Role Labeling.

I'm evaluating it for a work project now and it looks like it'll get the job done.

PractNLPTools: https://pypi.python.org/pypi/practnlptools/1.0

GitHub Support Site: https://github.com/biplab-iitb/practNLPTools

ProfVersaggi answered Nov 20 '22 13:11



As of now, probably the easiest option is the AllenNLP demo at https://demo.allennlp.org/semantic-role-labeling. Because of the underlying transformer architecture, it requires over 1 GB of memory.

Uzay Macar answered Nov 20 '22 12:11
