I'm interested in downloading some boundary files from statistics.gov.scot, an official repository for sharing statistical data that can be queried with SPARQL.
Statistics.gov.scot provides access to GeoJSON boundaries for a number of administrative and statistical geographies, such as local authority administrative boundaries or health boards. In my particular case I'm interested in downloading a data set with GeoJSON boundaries pertaining to data zones. Data zones are statistical geographies developed for the purpose of disseminating life outcomes data at a small-area level. When accessed via statistics.gov.scot, a sample data zone looks like this:
The geography and the related data can be accessed here. The corresponding GeoJSON data is available here.
Data zones are available in two iterations, one produced in 2004 and another updated recently. I would like to download the first iteration, produced in 2004. Following the information on the statistical entities, I drafted the following query:
PREFIX entity: <http://statistics.data.gov.uk/def/statistical-entity#>
PREFIX boundaries: <http://statistics.gov.scot/boundaries/>
SELECT ?boundary
WHERE {
entity:introduced <http://reference.data.gov.uk/id/day/2004-02-01>
}
LIMIT 1000
which returns the following error message:
Error There was a syntax error in your query: Encountered " "}" "} "" at line 7, column 3. Was expecting one of: <IRIref> ... <PNAME_NS> ... <PNAME_LN> ... <BLANK_NODE_LABEL> ... <VAR1> ... <VAR2> ... "true" ... "false" ... <INTEGER> ... <DECIMAL> ... <DOUBLE> ... <INTEGER_POSITIVE> ... <DECIMAL_POSITIVE> ... <DOUBLE_POSITIVE> ... <INTEGER_NEGATIVE> ... <DECIMAL_NEGATIVE> ... <DOUBLE_NEGATIVE> ... <STRING_LITERAL1> ... <STRING_LITERAL2> ... <STRING_LITERAL_LONG1> ... <STRING_LITERAL_LONG2> ... "(" ... <NIL> ... "[" ... <ANON> ... "+" ... "*" ... "/" ... "|" ... "?" ...
when tested via the endpoint: http://statistics.gov.scot/sparql.
Ideally, I would like to develop other queries that would enable me to source other statistical geographies by using the entity: prefix. This should be possible, as the entity: vocabulary will contain information on the available geographies (name, acronym, date of creation).
The query:
PREFIX entity: <http://statistics.data.gov.uk/def/statistical-entity#>
PREFIX boundaries: <http://statistics.gov.scot/boundaries/>
SELECT DISTINCT ?boundary ?shape WHERE {
?shape entity:firstcode ?boundary
}
LIMIT 1000
returns something that looks like a list of the desired geographies, but I'm struggling to source the GeoJSON boundaries.
The first query is missing the subject. A SPARQL query defines a set of triple patterns - a subject, predicate, and object - to match an RDF graph. To turn your WHERE clause into a SPARQL triple pattern, try:
?boundary entity:introduced <http://reference.data.gov.uk/id/day/2004-02-01>
Neither statistics.gov.scot nor statistics.data.gov.uk contains data zones boundaries as WKT or string literals.
However, with the following query, one could easily construct URLs of the GeoJSON files that are used on resources' pages:
PREFIX pref1: <http://statistics.data.gov.uk/def/statistical-entity#>
PREFIX pref2: <http://statistics.gov.scot/id/statistical-entity/>
PREFIX pref3: <http://statistics.data.gov.uk/def/boundary-change/>
PREFIX pref4: <http://reference.data.gov.uk/id/day/>
PREFIX pref5: <http://statistics.data.gov.uk/def/statistical-geography#>
PREFIX pref6: <http://statistics.gov.scot/id/statistical-geography/>
PREFIX pref7: <http://statistics.gov.scot/boundaries/>
SELECT ?zone ?name ?json {
?zone pref1:code pref2:S01 .
?zone pref3:operativedate pref4:2004-02-01
OPTIONAL { ?zone pref5:officialname ?name }
BIND (CONCAT(REPLACE(STR(?zone), STR(pref6:), STR(pref7:)), ".json") AS ?json)
} ORDER BY (!bound(?name)) ASC(?name)
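The BIND line in the query above only rewrites a string: it swaps the identifier prefix for the boundaries prefix and appends ".json". A minimal Python sketch of the same rewrite, in case the URLs need to be built outside SPARQL; the function name and the sample code S01000001 are illustrative assumptions, not taken from the query results:

```python
# Prefixes copied from pref6: and pref7: in the query above.
ID_PREFIX = "http://statistics.gov.scot/id/statistical-geography/"
BOUNDARY_PREFIX = "http://statistics.gov.scot/boundaries/"


def geojson_url(zone_uri: str) -> str:
    """Mirror CONCAT(REPLACE(STR(?zone), pref6:, pref7:), ".json").

    Uses a plain substring replacement, whereas SPARQL REPLACE treats its
    pattern as a regular expression; for these prefixes the result is the same
    as long as the identifier appears once in the URI.
    """
    return zone_uri.replace(ID_PREFIX, BOUNDARY_PREFIX) + ".json"


# Hypothetical zone URI, used only to show the shape of the output.
print(geojson_url(ID_PREFIX + "S01000001"))
```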
After that, one could easily retrieve the GeoJSON files using wget -i or something similar.
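If wget is not available, the same bulk download can be sketched with the Python standard library. This assumes the ?json column has been saved to a text file with one URL per line; the file name urls.txt, the output directory and the function names are all hypothetical:

```python
import urllib.request
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse


def read_urls(text: str) -> list[str]:
    """Return the non-empty, stripped lines of a URL list (one URL per line)."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def target_name(url: str) -> str:
    """Use the last path segment of the URL as the local file name."""
    return PurePosixPath(urlparse(url).path).name


def download_all(urls: list[str], out_dir: str = "boundaries") -> None:
    """Fetch each URL and save it under out_dir (performs network requests)."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for url in urls:
        with urllib.request.urlopen(url) as response:
            (out / target_name(url)).write_bytes(response.read())


# Example call, left commented out because it needs network access
# and the hypothetical urls.txt file:
# download_all(read_urls(Path("urls.txt").read_text()))
```

With several thousand data zones, it may be polite to add a short pause between requests.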
Some explanation
You should use <http://statistics.data.gov.uk/def/boundary-change/operativedate> instead of <http://statistics.data.gov.uk/def/statistical-entity#introduced>. The latter is a property of the statistical entity (the class of geographies, such as S01) rather than of the individual zones, as this query shows:
SELECT * WHERE {
?S <http://statistics.data.gov.uk/def/statistical-entity#introduced> ?date .
?S <http://www.w3.org/2000/01/rdf-schema#label> ?label
}
The second-generation data zones are dated 2014-11-06:
SELECT ?date (COUNT(?zone) AS ?count) WHERE {
?zone
<http://statistics.data.gov.uk/def/statistical-entity#code>
<http://statistics.gov.scot/id/statistical-entity/S01> ;
<http://statistics.data.gov.uk/def/boundary-change/operativedate>
?date
} GROUP BY ?date
Analogously, if you need URLs of corresponding GeoJSON files, your query should be:
SELECT ?zone ?name ?json {
?zone pref1:code pref2:S01 .
?zone pref3:operativedate pref4:2014-11-06 .
?zone pref5:officialname ?name
BIND (CONCAT(REPLACE(STR(?zone), STR(pref6:), STR(pref7:)), ".json") AS ?json)
} ORDER BY ASC(?name)
You do not need OPTIONAL, because all second-generation data zones have "official names".
Probably this page on data.gov.uk will be interesting for you.
There also exists opendata.stackexchange.com for questions related to open data.
Update
As of May 2018, one can retrieve data zones boundaries as WKT:
PREFIX pref1: <http://statistics.data.gov.uk/def/statistical-entity#>
PREFIX pref2: <http://statistics.gov.scot/id/statistical-entity/>
PREFIX pref3: <http://statistics.data.gov.uk/def/boundary-change/>
PREFIX pref4: <http://reference.data.gov.uk/id/day/>
PREFIX pref5: <http://statistics.data.gov.uk/def/statistical-geography#>
PREFIX pref6: <http://www.opengis.net/ont/geosparql#>
SELECT ?zone ?name ?geometry {
?zone pref1:code pref2:S01 .
?zone pref3:operativedate pref4:2014-11-06 .
?zone pref5:officialname ?name .
?zone pref6:hasGeometry/pref6:asWKT ?geometry .
} ORDER BY ASC(?name)
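The queries above can also be run from a script instead of the web form. A minimal sketch using only the Python standard library, assuming the endpoint mentioned in the question (http://statistics.gov.scot/sparql) follows the standard SPARQL protocol, i.e. accepts a GET request with a query parameter and can return results as application/sparql-results+json. The function names are illustrative:

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "http://statistics.gov.scot/sparql"  # endpoint named in the question


def build_request(query: str, endpoint: str = ENDPOINT) -> urllib.request.Request:
    """Build a SPARQL-protocol GET request asking for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def rows_from_results(results: dict) -> list[dict]:
    """Flatten SPARQL JSON results into one {variable: value} dict per row."""
    return [
        {var: cell["value"] for var, cell in binding.items()}
        for binding in results["results"]["bindings"]
    ]


def run_query(query: str) -> list[dict]:
    """Send the query and return the flattened rows (needs network access)."""
    with urllib.request.urlopen(build_request(query)) as response:
        return rows_from_results(json.load(response))
```

Calling run_query with the full text of a query, PREFIX lines included, should give one dictionary per result row, keyed by the variable names in the SELECT clause, for example zone, name and geometry.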