How to encode dependency path as a feature for classification?

I am trying to implement relation extraction between verb pairs. I want to use the dependency path from one verb to the other as a feature for my classifier (which predicts whether relation X holds or not). But I am not sure how to encode the dependency path as a feature. Below are some example dependency paths, given as space-separated relation annotations from StanfordCoreNLP Collapsed Dependencies:

nsubj acl nmod:from acl nmod:by conj:and
nsubj nmod:into
nsubj acl:relcl advmod nmod:of

It is important to keep in mind that these paths are of variable length, and a relation can reappear any number of times.

Two ways of encoding this feature come to mind, each a compromise:

1) Ignore the sequence and have one feature per relation, with its value being the number of times that relation appears in the path.

2) Use a sliding window of length n and have one feature for each possible pair of relations, with the value being the number of times those two relations appear consecutively; I suppose this is how one encodes n-grams. However, there are 50 possible relations, so even the bigram case yields 2,500 candidate features, and I cannot really go with this approach. (A sketch of both encodings follows below.)
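For concreteness, here is a minimal sketch of both encodings (Python; the function names are mine, for illustration only):

    from collections import Counter

    # Example paths from above, as space-separated relation labels.
    paths = [
        "nsubj acl nmod:from acl nmod:by conj:and",
        "nsubj nmod:into",
        "nsubj acl:relcl advmod nmod:of",
    ]

    def relation_counts(path):
        # Encoding 1: bag of relations -- order is discarded, one feature
        # per relation, valued by its occurrence count in the path.
        return Counter(path.split())

    def ngram_counts(path, n=2):
        # Encoding 2: sliding window of length n -- one feature per
        # consecutive n-tuple of relations, valued by its count.
        rels = path.split()
        return Counter("-".join(rels[i:i + n])
                       for i in range(len(rels) - n + 1))

    for p in paths:
        print(relation_counts(p), ngram_counts(p))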

Any suggestions are welcome.

asked Sep 25 '15 by Syed Fahad Sultan

1 Answer

We had a project that built a classifier based on dependency paths. I asked the group member who developed the system, and he said:

  1. An indicator feature for the whole path.

    So if you have the training data point (verb1 -e1-> w1 -e2-> w2 -e3-> w3 -e4-> verb2, relation1), the feature would be (e1-e2-e3-e4).

  2. He also used n-gram sequences, so for that same data point you would also have (e1), (e2), (e3), (e4), (e1-e2), (e2-e3), (e3-e4), (e1-e2-e3), (e2-e3-e4); see the sketch after this list.

    He also recommended collapsing appositive edges to make the paths shorter.
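Here is a minimal sketch of how those features could be generated, assuming the edge labels arrive as a list (the function name and feature-string format are illustrative, not from his system):

    def path_features(edges):
        # One indicator feature for the whole path, e.g. "path=e1-e2-e3-e4".
        feats = {"path=" + "-".join(edges)}
        # Plus every contiguous n-gram shorter than the full path, matching
        # the (e1), (e1-e2), (e1-e2-e3), ... example above.
        for n in range(1, len(edges)):
            for i in range(len(edges) - n + 1):
                feats.add("ngram=" + "-".join(edges[i:i + n]))
        return feats

    # For verb1 -e1-> w1 -e2-> w2 -e3-> w3 -e4-> verb2:
    print(sorted(path_features(["e1", "e2", "e3", "e4"])))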

Also, I should note that he developed a set of high-precision rules for each relation and used these to create a large set of training data.

answered Nov 06 '22 by StanfordNLPHelp