
Which parser is most suitable for [biomedical] relation extraction?

I have read about constituency parsers and dependency parsers, but I am confused about which would be the better choice.

My task is to extract relationships from English Wikipedia text (other sources may be included later). What I need is a semantic path (with only the most important information) between the two entities of interest. For instance,

from the text: "In America, diabetes is, as everybody knows, a common disease."

I need the information: "diabetes is disease"

Which parser implementation would you suggest? Stanford? MaltParser? Or another one?

Any clue is appreciated.

Matt asked Feb 21 '23 02:02


1 Answer

You mean a constituency (phrase-structure) parser vs. a dependency parser? Both are syntactic parsers; the online Stanford Parser shows you how the two kinds of parse differ.

Constituency Parse

(ROOT
  (S
    (PP (IN In)
      (NP (NNP America)))
    (, ,)
    (NP (NNP diabetes))
    (VP (VBZ is) (, ,)
      (PP (IN as)
        (NP (NN everybody) (NNS knows)))
      (, ,)
      (NP (DT a) (JJ common) (NN disease)))))

Dependency Parse (collapsed)

prep_in(disease-13, America-2)
nsubj(disease-13, diabetes-4)
cop(disease-13, is-5)
nn(knows-9, everybody-8)
prep_as(disease-13, knows-9)
det(disease-13, a-11)
amod(disease-13, common-12)
root(ROOT-0, disease-13)

The two are not that different, actually (see Collins' thesis or Nivre's book for more details), but I find dependency parses easier to work with. As you can see, you get a direct relation between diabetes and disease: nsubj(disease-13, diabetes-4). You can then attach the copula from cop(disease-13, is-5) to get "diabetes is disease".
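To make the "nsubj plus copula" idea concrete, here is a minimal sketch in plain Python that reads the collapsed dependencies shown above and pulls out the triple. The helper names (parse_deps, copular_relations) are made up for illustration, and the rule itself (pair every nsubj with a cop on the same head) is only one simple heuristic, not a complete relation extractor.

```python
import re

# Collapsed Stanford dependencies copied from the parse above.
DEPS = """\
prep_in(disease-13, America-2)
nsubj(disease-13, diabetes-4)
cop(disease-13, is-5)
nn(knows-9, everybody-8)
prep_as(disease-13, knows-9)
det(disease-13, a-11)
amod(disease-13, common-12)
root(ROOT-0, disease-13)
"""

# relation(head-index, dependent-index)
TRIPLE = re.compile(r"(\w+)\((\S+)-(\d+), (\S+)-(\d+)\)")


def parse_deps(text):
    """Hypothetical helper: return (relation, head, dependent) tuples,
    where head and dependent are (word, index) pairs."""
    triples = []
    for line in text.splitlines():
        m = TRIPLE.fullmatch(line.strip())
        if m:
            rel, head, h_idx, dep, d_idx = m.groups()
            triples.append((rel, (head, int(h_idx)), (dep, int(d_idx))))
    return triples


def copular_relations(triples):
    """Hypothetical helper: yield (subject, copula, predicate) for every
    head that has both an nsubj and a cop dependent."""
    subj = {head: dep for rel, head, dep in triples if rel == "nsubj"}
    cop = {head: dep for rel, head, dep in triples if rel == "cop"}
    for head in subj:
        if head in cop:
            yield subj[head][0], cop[head][0], head[0]


relations = list(copular_relations(parse_deps(DEPS)))
print(relations)  # [('diabetes', 'is', 'disease')]
```

Modifiers such as det, amod and the prep_* relations are simply ignored here, which is what drops "a", "common" and "In America" from the result.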

nflacco answered Apr 08 '23 13:04