Any alternatives to Virtuoso as a graph store? [closed]

I like (very much) that it supports SPARQL/Update, and I like the SPARQL endpoint that comes with it, but

  • I'm a little worried about vendor lock in
  • I think it is overkill for my requirements (I want a graph store with half a billion triples)
  • I would love to use an open-source and free product instead

So far I couldn't find any decent, comparable products (commercial or otherwise). The ones I found look immature or experimental to me. Any ideas?

Ashkan Kh. Nazary asked Feb 14 '11 05:02


3 Answers

What you might be looking for is 4store (http://4store.org/). You might also try searching for similar questions over on http://www.semanticoverflow.com/ (link is defunct).

dajobe answered Sep 28 '22 09:09


Having used a lot of different triple stores as storage layers in my research project, I would recommend the following two:

  • 4store - Already mentioned by dajobe. It is very good and has frequent releases to fix bugs and add new features as SPARQL 1.1 continues to be standardised. It also has the benefit of being totally free.
  • AllegroGraph - Free for up to 50 million triples, though it tends to be quite a RAM hog even at relatively low numbers of triples (e.g. it used around 3 of my 4GB of RAM when I had about 1.5m triples). Actual memory usage will vary with usage; in my case I was running an app that required my entire dataset to be loaded into memory. I haven't used Version 4, so I can't say whether they have improved this.

While Virtuoso is very good at some things, it has a seriously bad case of feature creep and a lot of non-standard/proprietary features which, as you imply, might lead to vendor lock-in.

Like Ian says, stick to the core language features in the SPARQL standards, and then you can easily move to a different triple store as your needs change. When developing your application, try to design it to be storage agnostic so you can just plug in a different storage layer as you need to. How easy this is will depend on your programming environment/language/API, but doing it will be beneficial in the long run.
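One way to picture the storage-agnostic design is the minimal Python sketch below. It is only an illustration: the names `SparqlStore`, `HttpSparqlStore`, `FakeStore`, `parse_results` and `count_triples` are hypothetical, and the sketch assumes the store exposes an endpoint that follows the standard SPARQL Protocol and returns SPARQL JSON results.

```python
import json
import urllib.parse
import urllib.request
from typing import Dict, List, Protocol

Row = Dict[str, str]


class SparqlStore(Protocol):
    """Hypothetical interface: the only thing application code depends on."""

    def select(self, query: str) -> List[Row]: ...


def parse_results(doc: dict) -> List[Row]:
    """Flatten a SPARQL JSON results document into plain dicts."""
    return [
        {name: cell["value"] for name, cell in binding.items()}
        for binding in doc["results"]["bindings"]
    ]


class HttpSparqlStore:
    """Talks to any endpoint that implements the standard SPARQL Protocol."""

    def __init__(self, endpoint_url: str) -> None:
        self.endpoint_url = endpoint_url  # assumption: supplied by the caller

    def build_request(self, query: str) -> urllib.request.Request:
        # Supplying a body makes urllib send a form-encoded POST.
        data = urllib.parse.urlencode({"query": query}).encode("ascii")
        headers = {"Accept": "application/sparql-results+json"}
        return urllib.request.Request(self.endpoint_url, data=data, headers=headers)

    def select(self, query: str) -> List[Row]:
        with urllib.request.urlopen(self.build_request(query)) as response:
            return parse_results(json.load(response))


class FakeStore:
    """In-memory stand-in for tests; returns canned rows for every query."""

    def __init__(self, rows: List[Row]) -> None:
        self.rows = rows

    def select(self, query: str) -> List[Row]:
        return list(self.rows)


def count_triples(store: SparqlStore) -> int:
    """Application logic: uses only standard SPARQL 1.1, no vendor extensions."""
    rows = store.select("SELECT (COUNT(*) AS ?n) WHERE { ?s ?p ?o }")
    return int(rows[0]["n"])
```

Because `count_triples` only sees the `select` method, switching from one triple store to another should, under these assumptions, mean changing the endpoint URL passed to `HttpSparqlStore` rather than touching application code.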

RobV answered Sep 28 '22 10:09


  • I'm a little worried about vendor lock in

OpenLink Software (my employer) works very hard to implement open standards and specifications where they exist and are sufficient. We add extensions, and document that we've done so, when necessary -- as with the aggregate and other analytics functions which were not part of SPARQL 1.0, but are part of SPARQL 1.1 and/or will be part of SPARQL 2.0.

If you stick with the published standards, you won't be locked in. If you need the extensions, we think we're not so much locking you in as enabling and empowering you... but your mileage may vary.

  • I think it is overkill for my requirements (I want a graph store with half a billion triples)

By all means, consider all the functionality you need when making your decision. But it seems likely to me that you'll be doing more than storing your triples. Queries, reasoning, query optimization, Federated SPARQL (joins against other remote SPARQL endpoints, formerly known as SPARQL-FED), and other functionality may not be so much overkill as simply not-yet-needed.

It's worth noting that Virtuoso can be run in a minimized form (LiteMode=1) which disables many of the features perceived as "overkill" and makes it much more like an embedded DBMS -- but still hybrid at the core. When Lite mode is on:

  • Web services are not initialized, i.e., no web server, DAV, SOAP, POP3, etc.
  • replication is stopped
  • PL debugging is disabled
  • plugins are disabled
  • Bonjour/Rendezvous is disabled
  • tables relevant to the above are not created
  • index tree maps is set to 8 if no other setting is given
  • memory reserve is not allocated
  • DisableTcpSocket setting is treated as 1, regardless of value in INI file
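
As a rough illustration of turning Lite mode on, the Python sketch below rewrites the INI text with the standard `configparser` module. It assumes that `LiteMode` lives in the `[Parameters]` section of `virtuoso.ini` (check this against your own file), and the helper name `enable_lite_mode` is hypothetical. Note that `configparser` drops comments when it writes the file back out, so editing the file by hand may be preferable.

```python
import configparser
import io


def enable_lite_mode(ini_text: str) -> str:
    """Return a copy of the INI text with LiteMode set to 1.

    Assumption: LiteMode belongs in the [Parameters] section.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key names case-sensitive (LiteMode, not litemode)
    parser.read_string(ini_text)
    if not parser.has_section("Parameters"):
        parser.add_section("Parameters")
    parser.set("Parameters", "LiteMode", "1")
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


if __name__ == "__main__":
    sample = "[Parameters]\nServerPort = 1111\nLiteMode = 0\n"
    print(enable_lite_mode(sample))
```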

  • I would love to use an open-source and free product instead

Virtuoso has two flavors -- commercial (VCE), and open source (VOS). Commercial includes shared-nothing elastic clustering which brings linear scalability, SPARQL GEO indexing and querying, result transformation to CXML for exploration with PivotViewer, and other features which VOS lacks ... but use the one that makes sense to you.

TallTed answered Sep 28 '22 09:09