
SPARQL query with COUNT and ORDER returns odd result

The following query counts all triples in the store whose object is <http://dbpedia.org/resource/Cat>:

SELECT count(*) where { ?s ?p <http://dbpedia.org/resource/Cat> }

It returns the expected result:

http://dbpedia.org/sparql?default-graph-uri=http://dbpedia.org&query=select+count(*)+{+%3Fs+%3Fp+%3Chttp://dbpedia.org/resource/Cat%3E+}+&debug=on&timeout=&format=text/html&save=display&fname=

However, when I first tried it, I accidentally left in an ORDER BY clause:

select count(*) { ?s ?p <http://dbpedia.org/resource/Cat> } order by ?s

Then I get a very long list of results

http://dbpedia.org/sparql?default-graph-uri=http://dbpedia.org&query=select+count(*)+{+%3Fs+%3Fp+%3Chttp://dbpedia.org/resource/Cat%3E+}+order+by+%3Fs&debug=on&timeout=&format=text/html&save=display&fname=

Can anyone explain why this result happens and what it means? Is it maybe a bug with the Virtuoso SPARQL implementation?

John McCrae asked Apr 01 '11 19:04




3 Answers

It does look like a bug. Compare the same kind of queries run on a different store, e.g. http://api.talis.com/stores/bbc-backstage/services/sparql (which doesn't run Virtuoso).

This first query works ...

SELECT (count(?s) as ?c)
WHERE {
?s ?p <http://purl.org/ontology/po/Version> .
}

and the second ...

SELECT (count(?s) as ?c)
WHERE {
?s ?p <http://purl.org/ontology/po/Version> .
} order by ?s

... gives the same result.

Actually, combining COUNT with ORDER BY doesn't make much sense here because ?s is not among the selected variables. But as you said, you left it in accidentally, and it does look like a bug.

My recommendation is to send an email to the virtuoso-user mailing list to report this issue.

Manuel Salvadores answered Oct 03 '22 15:10



We (= OpenLink) are in trouble here. This ORDER BY ?s is formally a bug in the query: an aggregate without grouping means "aggregate over everything", so there should be no variables outside aggregates at the output end of the query. However, this error is not currently reported: violations of this rule are so numerous that the SQL compiler performs auto-grouping, and our SPARQL-to-SQL preprocessor also ignores this error where possible.

We will probably keep the current behaviour as is. If a "strict" compilation mode is added, it will report an error in cases like this.

Ivan Mikhailov answered Oct 03 '22 16:10



This may be a bug in Virtuoso: it seems to treat queries with aggregates and an ORDER BY clause as having an implicit GROUP BY clause. I've noticed this on other Virtuoso endpoints besides the DBpedia one.

IMO this is a bug, and you should report it to the Virtuoso users mailing list as msalvadores suggests.
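
To make the implicit grouping concrete, here is a rough sketch of what the ORDER BY query from the question appears to be treated as. This is an assumption inferred from the behaviour described in this thread, not something taken from Virtuoso's documentation. The alias ?c is introduced purely for illustration, and the (COUNT(*) AS ?c) form is the standard SPARQL 1.1 projection syntax rather than the bare count(*) shorthand used in the question.

```sparql
# Assumed rough equivalent of:
#   select count(*) { ?s ?p <http://dbpedia.org/resource/Cat> } order by ?s
# One result row per distinct ?s, each row holding that subject's own count.
SELECT (COUNT(*) AS ?c)
WHERE { ?s ?p <http://dbpedia.org/resource/Cat> }
GROUP BY ?s
ORDER BY ?s
```

If that reading is right, it would explain the long result list: one count per subject instead of a single total. Adding ?s to the SELECT clause would show which subject each count belongs to, and dropping both GROUP BY and ORDER BY gives back the single overall count.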

RobV answered Oct 03 '22 16:10
