How do I do dependency parsing in NLTK?

Tags: python, nlp, grammar, nltk

Going through the NLTK book, it's not clear how to generate a dependency tree from a given sentence.

The relevant section of the book, the sub-chapter on dependency grammar, gives an example figure, but it doesn't show how to parse a sentence to come up with those relationships. Or maybe I'm missing something fundamental in NLP?

EDIT: I want something similar to what the Stanford Parser does: given the sentence "I shot an elephant in my sleep", it should return something like:

nsubj(shot-2, I-1)
det(elephant-4, an-3)
dobj(shot-2, elephant-4)
prep(shot-2, in-5)
poss(sleep-7, my-6)
pobj(in-5, sleep-7)


asked Sep 16 '11 10:09

MrD

3 Answers

We can use the Stanford Parser from NLTK.

Requirements

You need to download two things from the Stanford NLP website:

The Stanford CoreNLP parser.
The language model for your desired language (e.g. the English language model).

Warning!

Make sure that your language model version matches your Stanford CoreNLP parser version!

The current CoreNLP version as of May 22, 2018 is 3.9.1.

After downloading the two files, extract the zip file anywhere you like.

Python Code

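Next, load the model and use it through NLTK:

from nltk.parse.stanford import StanfordDependencyParser

# Adjust both paths to wherever you extracted the downloads
path_to_jar = 'path_to/stanford-parser-full-2014-08-27/stanford-parser.jar'
path_to_models_jar = 'path_to/stanford-parser-full-2014-08-27/stanford-parser-3.4.1-models.jar'

dependency_parser = StanfordDependencyParser(path_to_jar=path_to_jar, path_to_models_jar=path_to_models_jar)

# raw_parse() returns an iterator of dependency graphs; take the first one
result = dependency_parser.raw_parse('I shot an elephant in my sleep')
dep = next(result)  # next(result) works on Python 2 and 3; result.next() is Python 2 only

# Each triple is ((head, head_tag), relation, (dependent, dependent_tag))
list(dep.triples())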

Output

The output of the last line (shown as printed by Python 2; Python 3 omits the u prefixes) is:

[((u'shot', u'VBD'), u'nsubj', (u'I', u'PRP')),
 ((u'shot', u'VBD'), u'dobj', (u'elephant', u'NN')),
 ((u'elephant', u'NN'), u'det', (u'an', u'DT')),
 ((u'shot', u'VBD'), u'prep', (u'in', u'IN')),
 ((u'in', u'IN'), u'pobj', (u'sleep', u'NN')),
 ((u'sleep', u'NN'), u'poss', (u'my', u'PRP$'))]

I think this is what you want.
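Note that the triples carry no token positions. If you need the indexed relation(head-i, dependent-j) lines from the question, something along these lines might work. This is only a sketch: ROWS and stanford_style are hypothetical names, and the rows are hand-written from the question's expected output rather than produced by a parser. In practice you would fill them from whatever per-token head information your parser exposes.

```python
# Hand-written rows copied from the question's expected output, NOT parser
# output: (position, word, head_position, relation). Positions are 1-based
# and head_position 0 marks the root of the sentence.
ROWS = [
    (1, "I", 2, "nsubj"),
    (2, "shot", 0, "root"),
    (3, "an", 4, "det"),
    (4, "elephant", 2, "dobj"),
    (5, "in", 2, "prep"),
    (6, "my", 7, "poss"),
    (7, "sleep", 5, "pobj"),
]


def stanford_style(rows):
    """Format each non-root token as relation(head-position, word-position)."""
    # Look up a word by its position so a head index can be turned into text.
    words = {position: word for position, word, _, _ in rows}
    lines = []
    for position, word, head, relation in rows:
        if head == 0:  # the root has no governing word to print
            continue
        lines.append(f"{relation}({words[head]}-{head}, {word}-{position})")
    return lines


for line in stanford_style(ROWS):
    print(line)
```

The root token is skipped because the question's expected output lists only relations between two words.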


answered Oct 17 '22 16:10

ywat

I think you could use a corpus-based dependency parser instead of the grammar-based one NLTK provides.

Doing corpus-based dependency parsing on even a small amount of text in pure Python is not ideal performance-wise, so NLTK provides a wrapper for MaltParser, a corpus-based dependency parser.

You might find this other question about RDF representation of sentences relevant.

answered Oct 17 '22 16:10

Neodawn

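If you need better performance, then spaCy (https://spacy.io/) is a good choice. Usage is very simple:

import spacy

# spaCy 3 removed the 'en' shortcut, so load a model by its full name
# (download it first: python -m spacy download en_core_web_sm)
nlp = spacy.load('en_core_web_sm')
doc = nlp('A woman is walking through the door.')
for token in doc:
    print(token.dep_, token.head.text, token.text)  # relation, head, dependent

The parsed result holds the dependency tree, and you can easily extract whatever information you need from it. You can also define your own custom pipelines. See more on their website.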

https://spacy.io/docs/usage/

answered Oct 17 '22 17:10

Aleksandar Jovanovic
