
RDF representation of sentences

I need to represent sentences in RDF format.

In other words "John likes coke" would be automatically represented as:

Subject : John
Predicate : Likes
Object : Coke

Does anyone know where I should start? Are there any programs which can do this automatically or would I need to do everything from scratch?

asked Apr 24 '10 19:04 by Lilz



2 Answers

It looks like you want the typed dependencies of a sentence, e.g. for John likes coke:

 nsubj(likes-2, John-1)
 dobj(likes-2, coke-3)
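As a rough illustration of how typed dependencies in this textual form could be turned into subject/predicate/object triples, here is a minimal Python sketch. It assumes the `relation(governor-index, dependent-index)` layout shown above and only pairs an `nsubj` with a `dobj` that share the same governing verb. The function names are hypothetical and not part of any parser's API.

```python
import re

# Matches one typed dependency such as "nsubj(likes-2, John-1)".
DEP_PATTERN = re.compile(r"(\w+)\((\S+)-(\d+),\s*(\S+)-(\d+)\)")


def parse_dependencies(text):
    """Return (relation, governor, governor_index, dependent, dependent_index) tuples."""
    deps = []
    for line in text.splitlines():
        match = DEP_PATTERN.search(line)
        if match:
            rel, gov, gov_i, dep, dep_i = match.groups()
            deps.append((rel, gov, int(gov_i), dep, int(dep_i)))
    return deps


def dependencies_to_triples(deps):
    """Pair each nsubj with the dobj attached to the same governor (the verb)."""
    subjects = {}
    objects = {}
    for rel, gov, gov_i, dep, _dep_i in deps:
        key = (gov, gov_i)  # the index keeps repeated words apart
        if rel == "nsubj":
            subjects[key] = dep
        elif rel == "dobj":
            objects[key] = dep
    # A triple exists only where one verb has both a subject and a direct object.
    return [(subjects[k], k[0], objects[k]) for k in subjects if k in objects]


if __name__ == "__main__":
    parse = " nsubj(likes-2, John-1)\n dobj(likes-2, coke-3)"
    print(dependencies_to_triples(parse_dependencies(parse)))
    # prints [('John', 'likes', 'coke')]
```

Sentences with passives, clausal objects, or several verbs would need extra rules; this sketch covers only the simple transitive case from the question.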

I'm not aware of any dependency parser that directly produces RDF. However, many of them produce parses in a standardized tab-delimited representation known as CoNLL-X, and it shouldn't be too hard to convert from CoNLL-X to RDF.
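To give a feel for what such a conversion might look like, here is a minimal Python sketch that reads one CoNLL-X sentence and emits N-Triples lines. It relies on the CoNLL-X column order (ID, FORM, LEMMA, CPOSTAG, POSTAG, FEATS, HEAD, DEPREL, PHEAD, PDEPREL) and assumes the parser uses the Stanford Dependencies labels `nsubj` and `dobj`. The `http://example.org/` namespace, the sample rows, and the function names are all made up for illustration.

```python
from urllib.parse import quote

BASE = "http://example.org/"  # hypothetical namespace, replace with your own

# One sentence in CoNLL-X layout; the values are hand-written for illustration.
ROWS = [
    ["1", "John", "_", "NNP", "NNP", "_", "2", "nsubj", "_", "_"],
    ["2", "likes", "_", "VBZ", "VBZ", "_", "0", "root", "_", "_"],
    ["3", "coke", "_", "NN", "NN", "_", "2", "dobj", "_", "_"],
]
SAMPLE = "\n".join("\t".join(row) for row in ROWS)


def read_conllx(text):
    """Parse a single CoNLL-X sentence into a list of token dicts."""
    tokens = []
    for line in text.splitlines():
        if not line.strip():
            continue  # blank lines separate sentences; this sketch handles one
        cols = line.split("\t")
        tokens.append({
            "id": int(cols[0]),    # ID
            "form": cols[1],       # FORM (the word itself)
            "head": int(cols[6]),  # HEAD (ID of the governing token, 0 = root)
            "deprel": cols[7],     # DEPREL (dependency label)
        })
    return tokens


def iri(token):
    """Build an IRI for a token from its word form (percent-encoded)."""
    return "<%s%s>" % (BASE, quote(token["form"]))


def to_ntriples(tokens):
    """Emit one N-Triples line per (nsubj, head, dobj) combination."""
    lines = []
    for head in tokens:
        children = [t for t in tokens if t["head"] == head["id"]]
        subjects = [t for t in children if t["deprel"] == "nsubj"]
        objects = [t for t in children if t["deprel"] == "dobj"]
        for s in subjects:
            for o in objects:
                lines.append("%s %s %s ." % (iri(s), iri(head), iri(o)))
    return lines


if __name__ == "__main__":
    for line in to_ntriples(read_conllx(SAMPLE)):
        print(line)
    # prints <http://example.org/John> <http://example.org/likes> <http://example.org/coke> .
```

Minting IRIs straight from word forms is a simplification. A real vocabulary would likely map words to existing resources and properties, and might represent names as literals instead.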

Open Source Dependency parsers

There are a number of parsers to choose from that extract typed dependencies, including the following state-of-the-art open source options:

  • Stanford Parser - see online demo.
  • MaltParser
  • MSTParser

The Stanford Parser includes a pre-trained model for parsing English. To get typed dependencies you'll need to use the flag -outputFormat typedDependencies.

For the MaltParser you can download an English model here.

The MSTParser includes a small 200-sentence English training set that you can use to create your own English parsing model. However, training on so little data will hurt the accuracy of the resulting parser. So, if you decide to use this parser, you are probably better off using the pretrained model available here.

All of the pretrained models linked above produce parses according to the Stanford Dependency formalism (ACL paper, and manual).

Of these three, the Stanford Parser is the most accurate. The MaltParser is the fastest, with some configurations able to parse 1800 sentences in only 8 seconds.

answered Oct 06 '22 23:10 by dmcer


One option is to use output from Link Parser, available under a GPL-compatible license. You can define a translation layer between these outputs and your RDF nodes as needed.

Check out this demo on your "John likes coke" example!

answered Oct 06 '22 23:10 by Bosh