
How to extract nodes that occur as both subject and object in a graph?

Tags:

rdf

sparql

I want to retrieve a list of nodes (vertices in the graph) that occur in both the subject and the object position of a triple (not necessarily the same triple).

I tried doing it using a subquery as follows:

SELECT ?x
{

       ?x ?p ?o.

     {
         SELECT ?x  WHERE { ?s ?p ?x . }
     }
}

This does not give me the results I expect: the same node shows up multiple times. And when I tried DISTINCT, it gave even more instances for some reason.

On a side note, if I wanted to extract the nodes that are subject OR object, how should I go about doing that?

Please excuse any mistakes in the vocabulary used.

user3451166 asked Mar 20 '23 07:03


1 Answer

Nodes that are subjects and objects

Short and sweet

Just ask for something that appears as a subject and an object:

select distinct ?x {
  ?s1 ?p1  ?x .
   ?x ?p2 ?o2 .
}
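
To see what this query computes, and why distinct matters, here is a minimal Python sketch that models a graph as a set of (subject, predicate, object) tuples. The sample triples and the name `triples` are made up for illustration; this is only a picture of the join, not how a SPARQL engine is actually implemented.

```python
# Hypothetical sample graph: each triple is (subject, predicate, object).
triples = {
    ("a", "p", "b"),
    ("b", "p", "c"),
    ("b", "q", "d"),
    ("e", "p", "b"),
}

# The join "?s1 ?p1 ?x . ?x ?p2 ?o2" yields one row per pairing of an
# incoming edge of ?x with an outgoing edge of ?x.
rows = [
    x
    for (_s1, _p1, x) in triples
    for (x2, _p2, _o2) in triples
    if x == x2
]

# "distinct" collapses those rows to one per node.
both = set(rows)

print(sorted(rows))  # "b" is repeated once per incoming/outgoing pairing
print(sorted(both))
```

In this sample, "b" has two incoming and two outgoing edges, so it appears 2 × 2 = 4 times before distinct is applied and once afterwards.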

Making it illegible (just for fun)

If you want to make that a bit shorter, but much less readable, you can use something like

prefix : <...anything...>

select distinct ?x {
  ?x (:|!:) ?o ; ^(:|!:) ?s .
}

The pattern (:|!:) matches any property that is either : or not :. That means it matches everything; it's just a wildcard. (You could just use ?p, which is essentially a wildcard, too, but keep reading…)

The path ^p means p, but in the reverse direction (so, e.g., ?person foaf:name ?name and ?name ^foaf:name ?person match the same data). Since (:|!:) is a wildcard, ^(:|!:) is a wildcard in the reverse direction. We can't use variables in property paths, so even though ?p is a "forward wildcard", we can't use ^?p as a "backward wildcard".

The ; notation just lets you abbreviate, e.g., ?x :p1 :o1 and ?x :p2 :o2 as ?x :p1 :o1 ; :p2 :o2. Using it here, we can get:

?x  (:|!:) ?o ;    # every ?x that is a subject
   ^(:|!:) ?s .    # every ?x that is an object

Removing comments and linebreaks, we get

?x (:|!:) ?o ; ^(:|!:) ?s .

You should probably use the readable one. :)
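
As a rough mental model of the forward and reverse wildcards, the sketch below (plain Python, with a made-up `triples` set and hypothetical helper names `forward` and `inverse`) treats a path as a set of (start, end) pairs. It only illustrates the semantics described above; it is not a property path evaluator.

```python
# Hypothetical sample graph: each triple is (subject, predicate, object).
triples = {
    ("alice", "name", "Alice"),
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
}

def forward(graph):
    """Pairs (start, end) linked by any property: the (:|!:) wildcard."""
    return {(s, o) for (s, _p, o) in graph}

def inverse(pairs):
    """Swap each pair, mirroring what ^path does to a path."""
    return {(end, start) for (start, end) in pairs}

fwd = forward(triples)
bwd = inverse(fwd)

# "?x (:|!:) ?o ; ^(:|!:) ?s" keeps every ?x that starts a forward pair
# and also starts a reverse pair.
both = {x for (x, _o) in fwd} & {x for (x, _s) in bwd}
print(sorted(both))
```

Only "bob" is both the start of a forward pair (bob knows carol) and the start of a reverse pair (alice knows bob, read backwards).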

Nodes that are subjects or objects

This was already answered in your previous question about computing node degree, "How to calculate maximum degree of a directed graph using SPARQL?" The answer there used this query to compute degree:

select ?x (count(*) as ?degree) { 
  { ?x ?p ?o } union
  { ?s ?p ?x }
}
group by ?x

It can find nodes that are subjects or objects, too, though. Just change it to:

select distinct ?x  { 
  { ?x ?p ?o } union
  { ?s ?p ?x }
}

Alternatively, you could use a wildcard approach here, too:

prefix : <...anything...>

select distinct ?x {
  ?x (:|!:)|^(:|!:) [] .
}
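
In the same spirit, the subject-or-object case is a set union, and the degree query counts one row per triple that a node takes part in. A minimal sketch, again with a made-up `triples` set:

```python
from collections import Counter

# Hypothetical sample graph: each triple is (subject, predicate, object).
triples = {
    ("a", "p", "b"),
    ("b", "p", "c"),
    ("b", "q", "d"),
}

subjects = [s for (s, _p, _o) in triples]
objects = [o for (_s, _p, o) in triples]

# { ?x ?p ?o } union { ?s ?p ?x } with distinct:
# every node that is a subject or an object.
either = set(subjects) | set(objects)

# The same union with count(*) grouped by ?x:
# one row per triple the node appears in as subject or object.
degree = Counter(subjects + objects)

print(sorted(either))
print(degree["b"])
```

Here "b" is the subject of two triples and the object of one, so its count is 3.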

Joshua Taylor answered May 01 '23 20:05