
Are there threaded SPARQL implementations?

Tags: rdf, sparql

We are building an iterative algorithm that runs a set of SPARQL queries on each iteration. The algorithm works well, but we're running into a CPU utilization problem. SPARQL engines like Fuseki are not truly multithreaded: they can execute multiple simultaneous queries in separate threads, but each individual query is single-threaded. From looking at some Fuseki notes, I get the impression that Fuseki is not thread-safe, so this is not a trivial issue.

Since our algorithm is inherently serial in terms of the SPARQL queries, and we are interested in one run at a time, is there some SPARQL engine that can take advantage of, say, 32 cores?

Adam asked Feb 07 '13 01:02


2 Answers

Yes, there are. BigData is an open source/commercial example of this.

My own project, dotNetRDF, also makes heavy use of multithreading. In my case I leverage the .NET PLINQ feature to parallelize joins, products, FILTER and BIND operations, though they aren't always amenable to this.
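To illustrate the kind of data parallelism meant here, below is a minimal sketch of a FILTER evaluated over chunks of solution mappings. It is not dotNetRDF code and it is Python rather than .NET; `parallel_filter`, the `workers` parameter, and the sample data are all hypothetical names made up for the example. Note that in CPython threads generally won't speed up a CPU-bound predicate, so the point is only the split/evaluate/merge shape that PLINQ-style operators provide.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_filter(solutions, predicate, workers=4):
    """Apply a FILTER-style predicate to solution mappings, chunk by chunk.

    Hypothetical helper for illustration only; not part of any SPARQL engine.
    """
    if not solutions:
        return []

    # Split the solutions into roughly equal contiguous chunks.
    size = max(1, -(-len(solutions) // workers))  # ceiling division
    chunks = [solutions[i:i + size] for i in range(0, len(solutions), size)]

    def keep(chunk):
        # Each worker filters its own chunk independently of the others.
        return [s for s in chunk if predicate(s)]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in chunk order, so the input order is preserved.
        parts = list(pool.map(keep, chunks))

    # Merge the per-chunk results back into one list.
    return [s for part in parts for s in part]


# Made-up solution mappings, as if from a pattern like { ?person :age ?age }
solutions = [{"person": f"p{i}", "age": i} for i in range(10)]

# Roughly equivalent to FILTER(?age >= 7)
adults = parallel_filter(solutions, lambda s: s["age"] >= 7)
```

This only works cleanly because a FILTER looks at one solution at a time. Operators that need to see the whole result set (ordering, some aggregates) don't split this easily, which is part of why not every operation is amenable to it.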

On the note of Fuseki (disclaimer: I am also involved in the Apache Jena project), as AndyS points out, Fuseki itself is thread-safe. The issue is that the query engine (ARQ) is not designed to parallelize operations within a single query. Some ideas about this have been discussed in the past, but IMO it would involve a fairly significant rewrite.

RobV answered Oct 14 '22 00:10


The Urika engine developed and marketed by YarcData is highly multithreaded (up to several thousand simultaneous threads) and runs in very large memory. Probably not suitable for a hobbyist budget though. :)

spreinhardt answered Oct 13 '22 23:10