
Is it possible to use RDF storage also as a document-oriented database?

Suppose I have a large amount of heterogeneous JSON documents (i.e. named key-value mappings) and a hierarchy of classes (i.e. named sets) that these documents are attached to. I need to set up a data structure that will allow:

  1. CRUD operations on JSON documents.
  2. Retrieving JSON documents by ID really quickly.
  3. Retrieving all JSON documents that are attached to a certain class really quickly.
  4. Editing class hierarchy: adding/deleting classes, rearranging them.

I initially came up with the idea of storing the JSON documents in a document-oriented database (like CouchDB or MongoDB) and storing the class hierarchy in an RDF store (like 4store). Requirements 1, 2 and 4 are then covered naturally, and 3 is solved by maintaining, for every class in the RDF store, a list of the IDs of the documents attached to it.
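To make the intended structure concrete, here is a minimal in-memory sketch of that two-store design. It is only an illustration: the class `DocumentClassStore` and all of its method names are hypothetical, plain dictionaries stand in for the document database and the RDF store, and nothing here reflects the API of CouchDB, MongoDB or 4store.

```python
from collections import defaultdict


class DocumentClassStore:
    """In-memory stand-in for the two-store design (all names are hypothetical)."""

    def __init__(self):
        self.documents = {}              # doc_id -> JSON-like dict (document store role)
        self.parent = {}                 # class name -> parent class or None (hierarchy role)
        self.members = defaultdict(set)  # class name -> IDs of attached documents

    def add_class(self, name, parent=None):
        self.parent[name] = parent

    def move_class(self, name, new_parent):
        # Rearranging the hierarchy only touches the parent link.
        self.parent[name] = new_parent

    def put(self, doc_id, doc, class_name):
        # Create or overwrite a document and attach it to a class.
        self.documents[doc_id] = doc
        self.members[class_name].add(doc_id)

    def get(self, doc_id):
        # Requirement 2: a single key lookup.
        return self.documents[doc_id]

    def delete(self, doc_id):
        del self.documents[doc_id]
        for ids in self.members.values():
            ids.discard(doc_id)

    def documents_of(self, class_name):
        # Requirement 3: follow the per-class ID list, then fetch each document.
        return [self.documents[i] for i in sorted(self.members[class_name])]
```

The cost of this layout is that every write has to keep the per-class ID lists consistent with the documents, which is the bookkeeping that a single store would ideally take over.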

But then I figured that an RDF store could also handle the document-oriented part, i.e. retrieving JSON documents by ID. At first glance this seems true, but I'm still concerned about 2 and 3. Is there an RDF store that can retrieve documents (nodes) as fast as a document-oriented database serves documents? How fast will it answer queries like the one in 3? I've heard a bit about RDF stores being slow, the reification problem, etc.
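As a rough picture of what "documents as nodes" would mean, the sketch below flattens a flat JSON document into subject-predicate-object triples and keeps two indexes over them. Everything in it is an assumption made for illustration: `TinyTripleStore` and its methods are hypothetical, `rdf:type` is used as a plain string label rather than a full IRI, values must be scalars, and it says nothing about how a real RDF store indexes or performs.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"  # shorthand label for this sketch, not a full IRI


class TinyTripleStore:
    """Toy triple store with two indexes (hypothetical, for illustration only)."""

    def __init__(self):
        self.by_subject = defaultdict(set)   # subject -> {(predicate, object)}
        self.by_pred_obj = defaultdict(set)  # (predicate, object) -> {subject}

    def add(self, s, p, o):
        self.by_subject[s].add((p, o))
        self.by_pred_obj[(p, o)].add(s)

    def put_document(self, doc_id, doc, class_name):
        # One triple for class membership, one per top-level key.
        self.add(doc_id, RDF_TYPE, class_name)
        for key, value in doc.items():
            self.add(doc_id, key, value)

    def get_document(self, doc_id):
        # Query 2: reassemble the document from all triples of one subject.
        pairs = self.by_subject.get(doc_id, ())
        return {p: o for p, o in pairs if p != RDF_TYPE}

    def subjects_of_class(self, class_name):
        # Query 3: every subject that has an rdf:type triple for this class.
        return sorted(self.by_pred_obj.get((RDF_TYPE, class_name), ()))
```

Even in this toy form, fetching one document means collecting several triples and reassembling them, whereas a document store returns the stored object as a unit. Nested JSON objects would additionally need intermediate nodes, which the sketch leaves out.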

Is there an RDF store that is as convenient as, for example, CouchDB for casually retrieving objects by ID? What is the difference between using a document-oriented database and an RDF store for storing, retrieving and editing JSON-like objects?

martinthenext asked Nov 30 '11 20:11



1 Answer

You originally asked this question for graph databases (like Neo4j). That's why I'd like to add some notes.

  1. Graph databases use integrated indexing for nodes (and relationships), so the fast initial lookup of the root nodes of your documents goes through an index (external or in-graph).
  2. Additional in-graph indexes for paths (actually trees to the root) can be modelled more cleanly than a plain key-value lookup.
  3. If you model your documents as trees of nodes with properties, you can do any simple or complex CRUD operation (including structural ones).
  4. Retrieving all documents of a "type" or "class" can again be done via an index (index root nodes by type) or via in-graph category nodes.
  5. You can put those "type" or "class" category nodes into a hierarchy (or graph), which can then be edited using the usual graph database API.
  6. Traversing the graph can be done using traversers or an integrated graph query language (e.g. Cypher for Neo4j).
  7. Loading hierarchical data can be done either by custom importers or by a more general sub-graph importer (e.g. GEOFF).
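
The category-node idea from points 1, 4 and 5 can be sketched in plain Python. This is not Neo4j code and does not use any graph database API: `CategoryGraph` and its methods are hypothetical names, dictionaries stand in for the node index and the relationships, and a breadth-first walk stands in for a traverser.

```python
from collections import defaultdict, deque


class CategoryGraph:
    """Toy model of category nodes linked to document root nodes (hypothetical)."""

    def __init__(self):
        self.subcategories = defaultdict(set)  # category -> direct child categories
        self.roots = defaultdict(set)          # category -> attached document root IDs
        self.node_index = {}                   # document root ID -> properties (index lookup)

    def add_category(self, name, parent=None):
        self.subcategories.setdefault(name, set())
        if parent is not None:
            self.subcategories[parent].add(name)

    def add_document(self, doc_id, properties, category):
        self.node_index[doc_id] = properties
        self.roots[category].add(doc_id)

    def documents_under(self, category):
        # Breadth-first traversal from a category node through its subcategories,
        # collecting every document root attached along the way.
        seen, queue, found = {category}, deque([category]), set()
        while queue:
            current = queue.popleft()
            found |= self.roots.get(current, set())
            for child in self.subcategories.get(current, ()):
                if child not in seen:
                    seen.add(child)
                    queue.append(child)
        return sorted(found)
```

Lookup by ID is a single hit on `node_index`, and rearranging the hierarchy amounts to moving an entry between the child sets in `subcategories`, without touching the documents themselves.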
Michael Hunger answered Dec 25 '22 19:12
