I want to use SPARQL to retrieve the list of Italian cities with a population of more than 100,000, and I'm using the following query:
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?city ?name ?pop WHERE {
?city a dbo:Settlement .
?city foaf:name ?name .
?city dbo:populationTotal ?pop .
?city dbo:country ?country .
?city dbo:country dbpedia:Italy .
FILTER (?pop > 100000)
}
In the results I get, for example, two separate rows that represent the same entity but with different names:
http://dbpedia.org/resource/Bologna "Bologna"@en 384038
http://dbpedia.org/resource/Bologna "Comune di Bologna"@en 384038
How can I apply SELECT DISTINCT only to the ?city column while still getting the other columns in the output?
You can use GROUP BY to group by a specific column and then use the SAMPLE() aggregate to select one of the values from each of the other columns, e.g.:
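PREFIX dbo: <http://dbpedia.org/ontology/>
# foaf: and dbpedia: are predefined on the DBpedia endpoint; declared here so the query also runs elsewhere
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dbpedia: <http://dbpedia.org/resource/>
SELECT ?city (SAMPLE(?name) AS ?cityName) (SAMPLE(?pop) AS ?cityPop)
WHERE
{
?city a dbo:Settlement .
?city foaf:name ?name .
?city dbo:populationTotal ?pop .
?city dbo:country ?country .
?city dbo:country dbpedia:Italy .
FILTER (?pop > 100000)
}
GROUP BY ?city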
Grouping on ?city gives you only a single row per city. However, once you have grouped by ?city you can't directly select variables that aren't group variables. You must instead use the SAMPLE() aggregate to pick one of the values for each of the non-group variables you wish to have in the final results. This will select one of the values of ?name and ?pop to return as ?cityName and ?cityPop respectively.
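Note that SAMPLE() is free to return any value from the group, so the name you get back may differ between runs or between stores. If you want a repeatable choice, one option is to swap in MIN() or MAX(). The query below is a minimal sketch of that variant, not a definitive solution: the explicit foaf: and dbpedia: prefix declarations, the STR() call (used so that names are compared as plain strings rather than as language-tagged literals) and the ORDER BY clause are my additions and are not part of the original query.

```sparql
PREFIX dbo: <http://dbpedia.org/ontology/>
# Assumed standard namespace URIs; the DBpedia endpoint predefines these prefixes.
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dbpedia: <http://dbpedia.org/resource/>

# One row per city: the alphabetically first name and the largest population value.
SELECT ?city (MIN(STR(?name)) AS ?cityName) (MAX(?pop) AS ?cityPop)
WHERE
{
  ?city a dbo:Settlement ;
        foaf:name ?name ;
        dbo:populationTotal ?pop ;
        dbo:country dbpedia:Italy .
  FILTER (?pop > 100000)
}
GROUP BY ?city
ORDER BY DESC(?cityPop)
```

With MIN() over the plain strings, a city that has both "Bologna" and "Comune di Bologna" as names would consistently come back with the one that sorts first as a string.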