
Single quad + most basic SPARQL query = 1 result in Jena, 2 results in Sesame - who is right?

Add just this one quad to an empty store:

<http://x.com/s> <http://x.com/p> 2 <http://x.com/g> .

Then execute this SPARQL query (taken from Bob DuCharme's book 'Learning SPARQL', so this must be standard SPARQL for retrieving all quads across the dataset, regardless of implementation, right!?):

SELECT ?g ?s ?p ?o
WHERE {
{ ?s ?p ?o }
UNION
{ GRAPH ?g { ?s ?p ?o } } }

But Jena and Sesame reply with different answers!!? Here's what I see:

Jena Fuseki console on Tomcat 6.0.37 (version 2.10.0 - out-of-the-box, no configuration changes!) - (the correct answer as I understand things):

--------------------------------------------------------------
| g                | s                | p                | o |
==============================================================
| <http://x.com/g> | <http://x.com/s> | <http://x.com/p> | 2 |
--------------------------------------------------------------

Sesame Workbench on Tomcat 6.0.37 (version 2.7.3 - out-of-the-box, no configuration changes!): I used the 'Add' feature in the workbench to manually add the above quad, pasting it into the 'Enter the RDF data you wish to upload' edit box with 'N-Quad' selected in the 'Data format' dropdown box, and then ran the above query:

--------------------------------------------------------------
| g                | s                | p                | o |
==============================================================
|                  | <http://x.com/s> | <http://x.com/p> | 2 |
| <http://x.com/g> | <http://x.com/s> | <http://x.com/p> | 2 |
--------------------------------------------------------------

So this is kinda scary for someone starting to look at RDF - what am I missing here? I assume Sesame can't be 'wrong' - so it must be my 'interpretation' I suppose (or Bob's query isn't 'standard SPARQL', and so different implementations are free to return different results) - any enlightenment would be very welcome :) !

Pat McBennett asked Aug 21 '13 19:08



1 Answer

As @Joshua Taylor points out in his comment, the cause is that Sesame and Jena interpret the default graph differently.

In Sesame, the entire repository is considered the default graph: all statements in all named graphs as well as all statements without a named graph. Therefore, the first argument of your union, which queries the default graph, succeeds and binds ?s, ?p and ?o (but not ?g). The second argument of your union succeeds as well, because the original quad is in a named graph, and therefore you get two answers.

Jena uses an "exclusive" default graph by default: only statements that are not explicitly added to any particular named graph are in the default graph. Therefore, in Jena, the first part of your union fails (there are no matching statements in Jena's default graph), the second part succeeds, and you therefore only get 1 result.

Strange as it may sound, both are correct. The difference is simply in how the dataset on which the query is executed is set up.
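To make the difference concrete, here is a minimal Python sketch of the two conventions. It is a toy model, not Jena or Sesame code: the names QUADS, default_graph and run_union_query are made up for illustration, and the "inclusive" flag stands in for the store's default-graph setting.

```python
# Toy model of one query over one quad under two default-graph conventions.
# Illustration only: none of these names come from Jena or Sesame.

QUADS = [("http://x.com/s", "http://x.com/p", "2", "http://x.com/g")]


def default_graph(quads, inclusive):
    """Triples visible to a bare { ?s ?p ?o } pattern."""
    if inclusive:
        # Sesame-style: every statement, whether or not it has a graph name.
        return {(s, p, o) for s, p, o, g in quads}
    # Jena-style (exclusive): only statements with no graph name.
    return {(s, p, o) for s, p, o, g in quads if g is None}


def run_union_query(quads, inclusive):
    """Evaluate { ?s ?p ?o } UNION { GRAPH ?g { ?s ?p ?o } } as (g, s, p, o) rows."""
    # Left branch: matches the default graph and leaves ?g unbound (None).
    rows = [(None, s, p, o) for s, p, o in sorted(default_graph(quads, inclusive))]
    # Right branch: matches named graphs only and binds ?g.
    rows += [(g, s, p, o) for s, p, o, g in quads if g is not None]
    return rows


exclusive_rows = run_union_query(QUADS, inclusive=False)
inclusive_rows = run_union_query(QUADS, inclusive=True)
```

With the single quad from the question, exclusive_rows holds one row (with ?g bound) and inclusive_rows holds two (one with ?g unbound, one with it bound), matching the Jena and Sesame outputs shown in the question.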

Of course, there are ways to deal with this. In both Jena and Sesame, you can add FROM and FROM NAMED clauses to the query to make the queried dataset explicit (Sesame offers the sesame:nil graph name for explicitly querying those statements that have no named graph associated). Alternatively, you can modify the dataset definition programmatically before the query is executed. The precise mechanisms differ a bit: in Sesame, you can create a Dataset object and supply it with your query before executing; in Jena, I believe you can reconfigure the actual store or model on which you execute the query to behave differently.
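As a rough illustration of why an explicit dataset removes the ambiguity, the sketch below models how FROM and FROM NAMED clauses define the dataset according to the SPARQL specification. Again this is a toy model in plain Python with made-up names (build_dataset, run_union_query), not an API of either store, and it does not model sesame:nil or statements without a graph name.

```python
# Toy model of a SPARQL dataset built from FROM / FROM NAMED clauses.
# Illustration only; real stores read these clauses from the query text.
# With neither clause a real store falls back to its own default dataset,
# which this sketch does not model.

QUADS = [("http://x.com/s", "http://x.com/p", "2", "http://x.com/g")]


def build_dataset(quads, from_graphs=(), from_named=()):
    """Return (default_triples, named_graphs) for an explicit dataset."""
    # FROM: the default graph is the merge of the listed graphs.
    default = {(s, p, o) for s, p, o, g in quads if g in from_graphs}
    # FROM NAMED: only the listed graphs are reachable through GRAPH ?g.
    named = {
        name: {(s, p, o) for s, p, o, g in quads if g == name}
        for name in from_named
    }
    return default, named


def run_union_query(default, named):
    """Evaluate { ?s ?p ?o } UNION { GRAPH ?g { ?s ?p ?o } } as (g, s, p, o) rows."""
    rows = [(None, s, p, o) for s, p, o in sorted(default)]
    for name in sorted(named):
        rows += [(name, s, p, o) for s, p, o in sorted(named[name])]
    return rows


GRAPH = "http://x.com/g"
only_named = run_union_query(*build_dataset(QUADS, from_named=[GRAPH]))
only_default = run_union_query(*build_dataset(QUADS, from_graphs=[GRAPH]))
both = run_union_query(*build_dataset(QUADS, from_graphs=[GRAPH], from_named=[GRAPH]))
```

In this model, FROM NAMED alone yields the single row with ?g bound, FROM alone yields the single row with ?g unbound, and listing the graph in both clauses yields two rows. The answer is fixed by the dataset the query declares, not by the store's default.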

Jeen Broekstra answered Oct 17 '22 14:10