DISTINCT only on one value with SPARQL

I want to use SPARQL to retrieve the list of Italian cities with a population of more than 100,000, and I'm using the following query:

PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbpedia: <http://dbpedia.org/resource/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?city ?name ?pop WHERE {
    ?city a dbo:Settlement .
    ?city foaf:name ?name .
    ?city dbo:populationTotal ?pop .
    ?city dbo:country ?country .
    ?city dbo:country dbpedia:Italy .
    FILTER (?pop > 100000)
}

In the results I get, for example, two different rows that represent the same entity but with different names:

http://dbpedia.org/resource/Bologna "Bologna"@en 384038

http://dbpedia.org/resource/Bologna "Comune di Bologna"@en 384038

How can I apply SELECT DISTINCT only to the ?city column while still getting the other columns in the output?

drstein asked Mar 11 '15 14:03


1 Answer

You can use GROUP BY to group by a specific column and then use the SAMPLE() aggregate to select one of the values from each of the other columns, e.g.:

PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbpedia: <http://dbpedia.org/resource/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?city (SAMPLE(?name) AS ?cityName) (SAMPLE(?pop) AS ?cityPop)
WHERE
{
    ?city a dbo:Settlement .
    ?city foaf:name ?name .
    ?city dbo:populationTotal ?pop .
    ?city dbo:country ?country .
    ?city dbo:country dbpedia:Italy .
    FILTER (?pop > 100000)
}
GROUP BY ?city

Grouping on ?city gives you a single row per city. However, once you have grouped by ?city, you can't directly select variables that aren't group variables.

You must instead use the SAMPLE() aggregate to pick one of the values for each of the non-group variables you want in the final results. Here it selects one of the values of ?name and one of the values of ?pop and returns them as ?cityName and ?cityPop respectively.
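If you need the same value on every run, one option is to swap SAMPLE() for a deterministic aggregate, since SAMPLE() is free to return any value from the group. The sketch below is a variant of the query above, not part of the original answer. It assumes a SPARQL 1.1 endpoint, takes the alphabetically smallest name string with MIN(STR(?name)) and the largest population with MAX(?pop), and drops the unused ?country pattern. The foaf: and dbpedia: prefix URIs are written out as an assumption about the endpoint's data.

```sparql
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbpedia: <http://dbpedia.org/resource/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

# Variant of the query above: deterministic aggregates instead of SAMPLE()
SELECT ?city (MIN(STR(?name)) AS ?cityName) (MAX(?pop) AS ?cityPop)
WHERE
{
    ?city a dbo:Settlement ;
          foaf:name ?name ;
          dbo:populationTotal ?pop ;
          dbo:country dbpedia:Italy .
    FILTER (?pop > 100000)
}
GROUP BY ?city
# Largest cities first; ?cityPop is the alias defined in the SELECT clause
ORDER BY DESC(?cityPop)
```

If "Bologna" and "Comune di Bologna" are the only two names attached to the Bologna resource, MIN(STR(?name)) returns "Bologna", because STR() strips the language tag and plain strings compare character by character.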

RobV answered Sep 27 '22 21:09