The following query counts all triples in a store that have <http://dbpedia.org/resource/Cat> as their object:
SELECT count(*) where { ?s ?p <http://dbpedia.org/resource/Cat> }
It returns the expected result:
http://dbpedia.org/sparql?default-graph-uri=http://dbpedia.org&query=select+count(*)+{+%3Fs+%3Fp+%3Chttp://dbpedia.org/resource/Cat%3E+}+&debug=on&timeout=&format=text/html&save=display&fname=
However, when I first tried it I accidentally left in an ORDER BY clause:
select count(*) { ?s ?p <http://dbpedia.org/resource/Cat> } order by ?s
With that clause in place I get a very long list of results instead of a single count:
http://dbpedia.org/sparql?default-graph-uri=http://dbpedia.org&query=select+count(*)+{+%3Fs+%3Fp+%3Chttp://dbpedia.org/resource/Cat%3E+}+order+by+%3Fs&debug=on&timeout=&format=text/html&save=display&fname=
Can anyone explain why this result happens and what it means? Is it maybe a bug with the Virtuoso SPARQL implementation?
It does look like a bug. You can run the same type of query on a different store, e.g. http://api.talis.com/stores/bbc-backstage/services/sparql (which doesn't run Virtuoso), and compare.
This first query works ...
SELECT (count(?s) as ?c)
WHERE {
?s ?p <http://purl.org/ontology/po/Version> .
}
and the second ...
SELECT (count(?s) as ?c)
WHERE {
?s ?p <http://purl.org/ontology/po/Version> .
} order by ?s
... gives the same result.
Actually, counting plus ordering doesn't make much sense here because ?s is not selected for retrieval. But as you said, you tried it accidentally, and it does look like a bug.
My recommendation is to send an email to the virtuoso-user mailing list to notify about this issue.
We (= OpenLink) are in trouble here. This ORDER BY ?s is formally a bug in the query: an aggregate without grouping means "aggregate over everything", so there should be no variables outside aggregates at the output end of the query. However, this error is not currently reported: violations of this rule are so numerous that the SQL compiler performs auto-grouping, and our SPARQL-to-SQL preprocessor also ignores this error where possible.
We will probably keep the current behaviour as is. If a "strict" compilation mode is added it will trigger the error reporting in cases like this.
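To make the auto-grouping described above concrete, here is a toy Python model of the two behaviours. It is only a sketch and not Virtuoso's actual code: the sample rows and the function names count_all and count_auto_grouped are made up for illustration, and the model assumes the stray ORDER BY ?s ends up grouping on ?s.

```python
from collections import Counter

# Hypothetical toy data standing in for the solutions of
# { ?s ?p <http://dbpedia.org/resource/Cat> }; only ?s matters here.
solutions = [
    {"s": "ex:Felidae", "p": "ex:member"},
    {"s": "ex:Kitten", "p": "ex:growsInto"},
    {"s": "ex:Kitten", "p": "ex:relatedTo"},
    {"s": "ex:Lion", "p": "ex:relatedTo"},
]


def count_all(rows):
    """Aggregate without grouping: a single row covering every solution."""
    return [len(rows)]


def count_auto_grouped(rows, key):
    """Aggregate with an implicit GROUP BY on `key`, ordered by `key`."""
    groups = Counter(row[key] for row in rows)
    return [groups[k] for k in sorted(groups)]


print(count_all(solutions))                # [4]
print(count_auto_grouped(solutions, "s"))  # [1, 2, 1]
```

Under this model the long result list is one count per distinct ?s, and those counts add up to the single number returned by the query without ORDER BY.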
This may be a bug in Virtuoso: it seems to treat queries with aggregates and an ORDER BY clause as having an implicit GROUP BY clause. I've noticed this on other Virtuoso endpoints besides the DBpedia one.
IMO this is a bug and you should report it to the Virtuoso users mailing list, as msalvadores suggests.