I have been trying to find out how to get the dependency tree with spaCy, but I can only find material on how to navigate the tree, not on how to get it.


How to get the dependency tree with spaCy?

2 Answers

The tree isn't an object in itself; you just navigate it via the relationships between tokens. That's why the docs talk about navigating the tree, but not 'getting' it.

First, let's parse some text to get a Doc object:

    >>> import spacy
    >>> nlp = spacy.load('en_core_web_sm')
    >>> doc = nlp('First, I wrote some sentences. Then spaCy parsed them. Hooray!')

doc is a Sequence of Token objects:

    >>> doc[0]
    First
    >>> doc[1]
    ,
    >>> doc[2]
    I
    >>> doc[3]
    wrote

But it doesn't have a single root token. We parsed a text made up of three sentences, so there are three distinct trees, each with its own root. If we want to start navigating from the root of each sentence, it helps to get the sentences as distinct objects first. Fortunately, doc exposes these to us via the .sents property:

    >>> sentences = list(doc.sents)
    >>> for sentence in sentences:
    ...     print(sentence)
    ...
    First, I wrote some sentences.
    Then spaCy parsed them.
    Hooray!

Each of these sentences is a Span with a .root property pointing to its root token. Usually, the root token will be the main verb of the sentence (although this may not be true for unusual sentence structures, such as sentences without a verb):

    >>> for sentence in sentences:
    ...     print(sentence.root)
    ...
    wrote
    parsed
    Hooray

With the root token found, we can navigate down the tree via the .children property of each token. For instance, let's find the subject and object of the verb in the first sentence. The .dep_ property of each child token describes its relationship with its parent; a dep_ of 'nsubj', for example, means that the token is the nominal subject of its parent.

    >>> root_token = sentences[0].root
    >>> for child in root_token.children:
    ...     if child.dep_ == 'nsubj':
    ...         subj = child
    ...     if child.dep_ == 'dobj':
    ...         obj = child
    ...
    >>> subj
    I
    >>> obj
    sentences

We can likewise keep going down the tree by viewing the children of one of these tokens:

    >>> list(obj.children)
    [some]
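The subject/object lookup above can be generalised by grouping a token's children by their dependency label. The following is only a sketch: `children_by_dep` and `fake_token` are hypothetical names, and the stand-in tokens merely mimic the `.text`, `.dep_` and `.children` attributes of a spaCy token so the example runs without a model. The 'nsubj' and 'dobj' labels come from the session above; the other labels are assumed for illustration.

```python
from collections import defaultdict
from types import SimpleNamespace


def children_by_dep(token):
    """Group a token's direct children by their dependency label."""
    grouped = defaultdict(list)
    for child in token.children:
        grouped[child.dep_].append(child)
    return dict(grouped)


def fake_token(text, dep, children=()):
    # Hypothetical stand-in for a spaCy Token (illustration only).
    return SimpleNamespace(text=text, dep_=dep, children=list(children))


# Hand-built tree for "First, I wrote some sentences."
# Labels other than nsubj/dobj are assumptions, not parser output.
wrote = fake_token("wrote", "ROOT", [
    fake_token("First", "advmod"),
    fake_token(",", "punct"),
    fake_token("I", "nsubj"),
    fake_token("sentences", "dobj", [fake_token("some", "det")]),
    fake_token(".", "punct"),
])

deps = children_by_dep(wrote)
print(deps["nsubj"][0].text)            # I
print(deps["dobj"][0].text)             # sentences
print([t.text for t in deps["punct"]])  # [',', '.']
```

With a real parse, the same function should work when called as `children_by_dep(sentences[0].root)`, since it only relies on `.children` and `.dep_`.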

Thus with the properties above, you can navigate the entire tree. If you want to visualise some dependency trees for example sentences to help you understand the structure, I recommend playing with displaCy.
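If you do want the tree as a single object, one option is to walk down from the root recursively and build it yourself. This is a minimal sketch rather than anything spaCy provides: `to_nested`, `render` and `fake_token` are hypothetical names, and the stand-in tokens only imitate the `.text`, `.dep_` and `.children` attributes so the example runs without a model. Dependency labels other than 'nsubj' and 'dobj' are assumed.

```python
from types import SimpleNamespace


def to_nested(token):
    """Return the subtree as nested (text, dep, children) tuples."""
    return (token.text, token.dep_, [to_nested(c) for c in token.children])


def render(token, depth=0):
    """Return the subtree as indented lines, one token per line."""
    lines = ["  " * depth + f"{token.text} ({token.dep_})"]
    for child in token.children:
        lines.extend(render(child, depth + 1))
    return lines


def fake_token(text, dep, children=()):
    # Hypothetical stand-in for a spaCy Token (illustration only).
    return SimpleNamespace(text=text, dep_=dep, children=list(children))


# Hand-built tree for "First, I wrote some sentences."
wrote = fake_token("wrote", "ROOT", [
    fake_token("First", "advmod"),
    fake_token(",", "punct"),
    fake_token("I", "nsubj"),
    fake_token("sentences", "dobj", [fake_token("some", "det")]),
    fake_token(".", "punct"),
])

print("\n".join(render(wrote)))
# wrote (ROOT)
#   First (advmod)
#   , (punct)
#   I (nsubj)
#   sentences (dobj)
#     some (det)
#   . (punct)
```

With a parsed document, calling `render(sentence.root)` for each sentence in `doc.sents` should give one such listing per sentence.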

answered Sep 22 '22 16:09

Mark Amery

If you just want to view the dependency tree produced by spaCy, one solution is to convert it to an nltk.tree.Tree and use the nltk.tree.Tree.pretty_print method. Here is an example:

    import spacy
    from nltk import Tree

    # The 'en' shortcut no longer works in spaCy 3; load a named model instead.
    en_nlp = spacy.load('en_core_web_sm')

    doc = en_nlp("The quick brown fox jumps over the lazy dog.")

    def to_nltk_tree(node):
        # Tokens with children become subtrees; leaves stay plain strings.
        if node.n_lefts + node.n_rights > 0:
            return Tree(node.orth_, [to_nltk_tree(child) for child in node.children])
        else:
            return node.orth_

    for sent in doc.sents:
        to_nltk_tree(sent.root).pretty_print()

Output:

                    jumps
      ________________|____________
     |    |     |     |    |      over
     |    |     |     |    |       |
     |    |     |     |    |      dog
     |    |     |     |    |    ___|____
    The quick brown  fox   .  the      lazy

Edit: To change the token representation, you can do this:

    def tok_format(tok):
        return "_".join([tok.orth_, tok.tag_])

    def to_nltk_tree(node):
        if node.n_lefts + node.n_rights > 0:
            return Tree(tok_format(node), [to_nltk_tree(child) for child in node.children])
        else:
            return tok_format(node)

Which results in:

                             jumps_VBZ
       __________________________|___________________
      |       |        |         |      |         over_IN
      |       |        |         |      |            |
      |       |        |         |      |          dog_NN
      |       |        |         |      |     _______|_______
    The_DT quick_JJ brown_JJ   fox_NN  ._. the_DT         lazy_JJ
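The same idea extends to other token attributes, such as the dependency label. The sketch below is an assumption-laden illustration, not part of the original answer: `make_tok_format` and `to_bracketed` are hypothetical helpers, the tree is rendered as a plain bracketed string so that NLTK is not needed, and the stand-in tokens only mimic `.orth_`, `.tag_`, `.dep_` and `.children`. The tags match the output above; the dependency labels are assumed.

```python
from types import SimpleNamespace


def make_tok_format(*attrs, sep="_"):
    """Build a formatter that joins the named token attributes."""
    def tok_format(tok):
        return sep.join(str(getattr(tok, attr)) for attr in attrs)
    return tok_format


def to_bracketed(node, fmt):
    """Render a subtree as a bracketed string: (parent child child ...)."""
    children = list(node.children)  # .children may be a generator
    if not children:
        return fmt(node)
    parts = [fmt(node)] + [to_bracketed(child, fmt) for child in children]
    return "(" + " ".join(parts) + ")"


def fake_token(orth, tag, dep, children=()):
    # Hypothetical stand-in for a spaCy Token (illustration only).
    return SimpleNamespace(orth_=orth, tag_=tag, dep_=dep, children=list(children))


# Hand-built subtree for "the lazy dog"; dependency labels are assumed.
dog = fake_token("dog", "NN", "pobj", [
    fake_token("the", "DT", "det"),
    fake_token("lazy", "JJ", "amod"),
])

fmt = make_tok_format("orth_", "tag_", "dep_")
print(to_bracketed(dog, fmt))  # (dog_NN_pobj the_DT_det lazy_JJ_amod)
```

A formatter built this way could also be used in place of `tok_format` inside `to_nltk_tree`, since it takes a token and returns a string.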


answered Sep 24 '22 16:09

Christos Baziotis


Tags:

python

spacy

Nicolas Joseph
