
SPARQL property path queries with arbitrary properties

Arbitrary-length SPARQL property paths require naming specific properties. I want to find any path that starts at one resource and ends at another, whatever properties it uses. For example:

SELECT ?p
WHERE { :startNode ?p* :endNode }

where ?p* specifies a path. Is there a way of doing this?

user3287385 asked Nov 02 '14 11:11


1 Answer

You're right that you can't use variables in property path expressions. There are a few things that you can do, though, that might help you.

A wildcard to check whether a path exists

You can build a wildcard by taking the disjunction of any IRI and its negation: every property either is that IRI or is not, so the alternation matches any property. That gives a simple pattern that checks whether there is a path connecting two resources:

<source> (<>|!<>)* <target>

If you have a : prefix defined, that can be even shorter, since : on its own is a valid prefixed name (it expands to the prefix IRI itself):

<source> (:|!:)* <target>
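
As a complete query, the wildcard check might look like the sketch below. It assumes a : prefix bound to urn:ex: and two placeholder resources, :a and :d, standing in for <source> and <target>. Because * also matches a zero-length path, the pattern is trivially true when the source and target are the same term; using + instead of * requires at least one edge.

prefix : <urn:ex:>

# Assumed names: :a and :d stand in for <source> and <target>.
# True if some chain of properties, of any kind, leads from :a to :d.
ask { :a (:|!:)* :d }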

If there is a path (or multiple paths) between two nodes, you can split it into two wildcard paths joined by a single triple ?x ?p ?y, and so find every ?p that occurs on some path:

<source> (:|!:)* ?x .
?x ?p ?y .
?y (:|!:)* <target> .

You can make that even shorter, I think, by using blank nodes instead of ?x and ?y:

<source> (:|!:)* [ ?p [ (:|!:)* <target> ] ]

(That might not work, though. I seem to recall the grammar actually disallowing property paths in some places within blank nodes. I'm not sure.)
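
Wrapped in a full query, the three-pattern version could be written roughly as follows. This is only a sketch: the urn:ex: prefix and the resources :a and :d are assumed stand-ins for <source> and <target>. The distinct keyword is added so that a property used by several edges, or reached along several paths, is reported once.

prefix : <urn:ex:>

# Assumed names: :a is the start node, :d is the end node.
select distinct ?p where {
  :a (:|!:)* ?x .   # any path from the start to ?x
  ?x ?p ?y .        # one edge somewhere along the way
  ?y (:|!:)* :d .   # any path from ?y to the end
}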

For a single path, get properties and positions, then group_concat

Now, in the case that there is just one path between two resources, you can even get the properties along that path, along with their positions. You could order by those positions, and then use group_concat to concatenate the properties in order into a single string. This is probably easiest to see with an example. Suppose you've got the following data, which has a single path from :a to :d:

@prefix : <urn:ex:> .

:a :p1 :b .
:b :p2 :c .
:c :p3 :d .

Then you can use a query like this to get each property in the path and its position. (This only works if there's a single path, though. See my answer to Is it possible to get the position of an element in an RDF Collection in SPARQL? for a bit more about how this works.)

prefix : <urn:ex:>

select ?p (count(?mid) as ?pos) where {
  :a (:|!:)* ?mid .
  ?mid (:|!:)* ?x .
  ?x ?p ?y .
  ?y (:|!:)* :d
}
group by ?x ?p ?y

-------------
| p   | pos |
=============
| :p2 | 2   |
| :p1 | 1   |
| :p3 | 3   |
-------------
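
If the intermediate nodes are also of interest, a variation on the query above (a sketch, not something the original query does) projects ?x and ?y as well and sorts the rows by the computed position. The same single-path caveat applies, and the data is assumed to be the three triples shown earlier. For that data the rows should come back in path order, with :p1 first and :p3 last.

prefix : <urn:ex:>

# ?pos counts the nodes ?mid lying between :a and ?x (inclusive),
# which equals the position of the edge ?x ?p ?y when there is one path.
select (count(?mid) as ?pos) ?x ?p ?y where {
  :a (:|!:)* ?mid .
  ?mid (:|!:)* ?x .
  ?x ?p ?y .
  ?y (:|!:)* :d
}
group by ?x ?p ?y
order by ?pos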

Now, if you order those results by ?pos and wrap that query in another, then you can use group_concat on ?p to get a single string of the properties in order. (The order being preserved isn't guaranteed, but it's pretty common behavior. See my answer to obtain the matrix in protege for another example of how this technique works, and my answer to Ordering in GROUP_CONCAT in SPARQL 1.1 for discussion about why it is not guaranteed.)

prefix : <urn:ex:>

select (group_concat(concat('<',str(?p),'>');separator=' ') as ?path) {
  select ?p (count(?mid) as ?pos) where {
    :a (:|!:)* ?mid .
    ?mid (:|!:)* ?x .
    ?x ?p ?y .
    ?y (:|!:)* :d
  }
  group by ?x ?p ?y
  order by ?pos
}

-----------------------------------------
| path                                  |
=========================================
| "<urn:ex:p1> <urn:ex:p2> <urn:ex:p3>" |
-----------------------------------------

Joshua Taylor answered Nov 07 '22 07:11