
Is it possible to get the position of an element in an RDF Collection in SPARQL?

Suppose that I have the following Turtle declaration:

    @prefix : <http://example.org#> .

    :ls :list (:a :b :c) .

Is there a way to get the positions of the elements in the collection?

For example, with this query:

    PREFIX :     <http://example.org#>
    PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

    SELECT ?elem
    WHERE {
      ?x :list ?ls .
      ?ls rdf:rest*/rdf:first ?elem .
    }

I get:

    --------
    | elem |
    ========
    | :a   |
    | :b   |
    | :c   |
    --------

But I would like a query to obtain:

    --------------
    | elem | pos |
    ==============
    | :a   |  0  |
    | :b   |  1  |
    | :c   |  2  |
    --------------

Is it possible?

Labra asked Jul 08 '13 09:07



2 Answers

I have found a way to do it using the property function library in ARQ. As Steve Harris says, this is non-standard.

    PREFIX :     <http://example.org#>
    PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX list: <http://jena.hpl.hp.com/ARQ/list#>

    SELECT ?elem ?pos
    WHERE {
      ?x :list ?ls .
      ?ls list:index (?pos ?elem) .
    }

Labra answered Oct 26 '22 09:10


A Pure SPARQL 1.1 Solution

I've extended the data to make the problem a little harder. Let's add a duplicate element to the list, e.g., an additional :a at the end:

    @prefix : <http://example.org#> .

    :ls :list (:a :b :c :a) .

Then we can use a query like this to extract each list node (and its element) along with the position of the node in the list. The idea is that we can match all the individual nodes in the list with a pattern like [] :list/rdf:rest* ?node. The position of each node, though, is the number of intermediate nodes between the head of the list and ?node. We can match each of those intermediate nodes by breaking the pattern down into

    [] :list/rdf:rest* ?mid .
    ?mid rdf:rest* ?node .

Then, if we group by ?node, each group contains one ?mid binding for every node from the head of the list up to and including ?node itself, so the number of ?mid bindings minus one is the position of ?node in the list. Thus we can use the following query (which also grabs the element, i.e., the rdf:first, associated with each node) to get the positions of elements in the list:

    prefix : <http://example.org#>
    prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

    select ?element (count(?mid)-1 as ?position)
    where {
      [] :list/rdf:rest* ?mid .
      ?mid rdf:rest* ?node .
      ?node rdf:first ?element .
    }
    group by ?node ?element

    ----------------------
    | element | position |
    ======================
    | :a      | 0        |
    | :b      | 1        |
    | :c      | 2        |
    | :a      | 3        |
    ----------------------

This works because an RDF list is a linked list like the one pictured below. Here ?head is the beginning of the list (the object of :list). It is also one of the bindings of ?mid, because the path in [] :list/rdf:rest* ?mid allows zero rdf:rest steps:

[Figure: graphical representation of the RDF list]
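The counting argument can be checked with a small Python model of the same linked list. This is only a sketch, not how a SPARQL engine evaluates the query: the node names n0..n3 and the helper functions are made up for illustration (real RDF lists use blank nodes), and the two dictionaries stand in for the rdf:first and rdf:rest triples.

```python
# Hypothetical in-memory model of the list (:a :b :c :a).
# Node names n0..n3 are assumptions for illustration only.
RDF_NIL = "rdf:nil"
first = {"n0": ":a", "n1": ":b", "n2": ":c", "n3": ":a"}   # rdf:first triples
rest = {"n0": "n1", "n1": "n2", "n2": "n3", "n3": RDF_NIL}  # rdf:rest triples
head = "n0"                                                 # object of :list


def reachable(start):
    """Nodes matched by `start rdf:rest* ?x` (zero or more rdf:rest steps)."""
    nodes = []
    node = start
    while node != RDF_NIL:
        nodes.append(node)
        node = rest[node]
    nodes.append(RDF_NIL)  # the path also reaches rdf:nil
    return nodes


def positions(list_head):
    """Mimic the query: count ?mid per (?node, ?element) group, minus one."""
    counts = {}
    for mid in reachable(list_head):      # [] :list/rdf:rest* ?mid
        for node in reachable(mid):       # ?mid rdf:rest* ?node
            if node in first:             # ?node rdf:first ?element
                key = (node, first[node])  # group by ?node ?element
                counts[key] = counts.get(key, 0) + 1
    return [(element, n - 1) for (_, element), n in counts.items()]


print(positions(head))
```

Grouping by the node rather than by the element alone is what keeps the two :a entries apart: they have the same rdf:first value but sit on different list nodes.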

Comparison with Jena ARQ Extensions

The asker of the question also posted an answer that uses Jena's ARQ extensions for working with RDF lists. The solution posted in that answer is

    PREFIX :     <http://example.org#>
    PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX list: <http://jena.hpl.hp.com/ARQ/list#>

    SELECT ?elem ?pos
    WHERE {
      ?x :list ?ls .
      ?ls list:index (?pos ?elem) .
    }

This answer depends on using Jena's ARQ and enabling the extensions, but it is more concise and transparent. What isn't obvious is whether either approach performs noticeably better. As it turns out, for small lists the difference isn't particularly significant, but for larger lists the ARQ extensions perform much better. The runtime of the pure SPARQL query quickly becomes prohibitively long, while the runtime of the version using the ARQ extensions barely changes.

    -------------------------------------------
    | num elements | pure SPARQL | list:index |
    ===========================================
    |      50      |    1.1s     |    0.8s    |
    |     100      |    1.5s     |    0.8s    |
    |     150      |    2.5s     |    0.8s    |
    |     200      |    4.8s     |    0.8s    |
    |     250      |    9.7s     |    0.8s    |
    -------------------------------------------

These specific values will obviously differ depending on your setup, but the general trend should be observable anywhere. Since things could change in the future, here's the particular version of ARQ I'm using:

    $ arq --version
    Jena:       VERSION: 2.10.0
    Jena:       BUILD_DATE: 2013-02-20T12:04:26+0000
    ARQ:        VERSION: 2.10.0
    ARQ:        BUILD_DATE: 2013-02-20T12:04:26+0000

As such, if I knew that I had to process lists of non-trivial sizes and that I had ARQ available, I'd use the extension.
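One plausible contributor to the trend in the timings is the number of (?mid, ?node) pairs the pure SPARQL pattern has to match before grouping, which grows quadratically with the list length. The sketch below only counts those pairs for the list sizes in the table. It is an assumption about where the work goes, not a profile of ARQ, and the function name matched_pairs is made up for illustration.

```python
def matched_pairs(n):
    """Count (?mid, ?node) pairs for an n-element list by brute force.

    Nodes are numbered 0..n-1 from the head of the list, and
    `?mid rdf:rest* ?node` holds exactly when mid <= node.
    """
    return sum(1 for node in range(n) for mid in range(node + 1))


for n in (50, 100, 150, 200, 250):
    # Prints the list length next to the number of pairs, n * (n + 1) / 2.
    print(n, matched_pairs(n))
```

A list of 250 elements yields 31375 pairs, against 1275 for 50 elements, whereas a single walk along the list would touch each node once.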

Joshua Taylor answered Oct 26 '22 08:10