
How to query for the number of distinct tuples using SPARQL 1.1?

It seems to be possible to count the distinct values of a single variable using

(COUNT(DISTINCT ?x) as ?count)

and to count the distinct tuples over all variables in the query using

(COUNT(DISTINCT *) as ?count)

However, I cannot figure out how to count specific (distinct) tuples. Something like

(COUNT(DISTINCT ?a ?b ?c) as ?count) 

does not seem to work. Am I doing it wrong, or is this really not allowed in SPARQL 1.1? Or is it supposed to work and just not supported in Sesame 2.6.0, which I am using for testing this?

jpp1 asked May 08 '12 15:05



1 Answer

Welcome to StackOverflow!

Make sure that your intermediate result only contains the three variables ?a ?b ?c that you're interested in.

One way of doing this is to use a subquery. The subquery projects only the three desired variables. Something like this:

SELECT (COUNT(*) AS ?count) {
   SELECT DISTINCT ?a ?b ?c {
      …
   }
}

(I'm not sure whether Sesame supports subqueries.)
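As a filled-in sketch of the subquery approach: the `ex:` prefix and the properties `ex:author`, `ex:year` and `ex:publisher` below are hypothetical and only stand in for whatever the real graph pattern is. The pattern also binds a fourth variable, `?doc`, which the inner `SELECT DISTINCT` projects away before the outer query counts the rows.

```sparql
PREFIX ex: <http://example.org/>

# Hypothetical vocabulary: ex:author, ex:year, ex:publisher are illustrative only.
SELECT (COUNT(*) AS ?count)
WHERE {
  {
    # Inner query: one row per distinct (?a, ?b, ?c) tuple.
    # ?doc is matched but not projected, so it does not affect distinctness.
    SELECT DISTINCT ?a ?b ?c
    WHERE {
      ?doc ex:author    ?a ;
           ex:year      ?b ;
           ex:publisher ?c .
    }
  }
}
```

The outer `COUNT(*)` then simply counts the rows the inner query returns, which is the number of distinct tuples.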

Another way is to simply make sure that your query only contains the three variables. If you need more variables inside the query, you may be able to replace them with blank nodes. Blank nodes in SPARQL graph patterns work like “anonymous variables”. There are some funny scoping issues with this though, so the subquery approach is probably better.
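A minimal sketch of the blank-node variant, again with a hypothetical `ex:` vocabulary. The subject that would otherwise need its own variable is written as the anonymous blank node `[]`, so `?a`, `?b` and `?c` are the only variables in the query and `COUNT(DISTINCT *)` counts distinct tuples over exactly those three.

```sparql
PREFIX ex: <http://example.org/>

# Hypothetical vocabulary: ex:author, ex:year, ex:publisher are illustrative only.
SELECT (COUNT(DISTINCT *) AS ?count)
WHERE {
  # [] matches any subject but is not a variable, so it is not part of the
  # tuples that DISTINCT compares.
  [] ex:author    ?a ;
     ex:year      ?b ;
     ex:publisher ?c .
}
```

One of the scoping restrictions: a labelled blank node such as `_:doc` may not be reused across different basic graph patterns in the same query (for example inside and outside an `OPTIONAL`), so this only works when the extra node is confined to a single pattern.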

cygri answered Nov 16 '22 03:11