
Multiple counts in Sparql query

I would like to create a Sparql query that contains two counts.

The query should get the 'neighbours of neighbours' of A (A → B → C, where A is the start node), and should report, for each C, how many paths there are from A to C and how many "inlinks" there are to C from anywhere. The result set should be as follows:

C | #C |  C_INLINKS
--------------------------
A | 2  | 123
B | 3  | 234

Where #C is the number of paths to C from starting node A.

I can create the counts separately, but I don't know how to combine these:

Count neighbours of neighbours:

select ?c (count(?c) as ?countc) WHERE {
   <http://dbpedia.org/resource/AFC_Ajax> ?p1 ?b.
   ?b ?p2 ?c.
   FILTER (regex(str(?c), '^http://dbpedia.org/resource/'))
}
GROUP BY ?c
ORDER BY DESC(?countc)
LIMIT 100

Count inlinks to neighbours of neighbours:

select ?c (count(?inlink) as ?inlinks) WHERE {
   <http://dbpedia.org/resource/AFC_Ajax> ?p1 ?b.
   ?b ?p2 ?c.
   ?inlink ?p3 ?c
   FILTER (regex(str(?c), '^http://dbpedia.org/resource/'))
}
GROUP BY ?c
ORDER BY DESC(?inlinks)
LIMIT 100

Is it possible to combine these two queries? Thank you!

user1255553 asked Mar 20 '26 21:03


1 Answer

The counts you're trying to extract require you to group by different things. group by lets you specify what you're counting with respect to. E.g., when you say select (count(?x) as ?xn) {...} group by ?y, you're asking "how many ?x's appear for each value of ?y?" The counts you're looking for are "how many C's per A" and "how many inlinks per C". In general, that means you'd need to group by ?a in one case and by ?c in the other. However, since you've got a fixed ?a here, this is a little bit easier: you can group by ?c alone.

Both counts need to be distinct, because joining the paths with the ?inlink pattern repeats every path once per inlink and every inlink once per path. Counting the distinct paths (?p1, ?b, ?p2) is a little bit tricky, since count(distinct …) accepts only a single expression. However, you can be sneaky and count distinct concat(str(?p1),str(?b),str(?p2)), which is a single expression and should be unique for each ?p1 ?b ?p2 combination. Then I think you'd be looking for a query like this:

prefix dbpedia: <http://dbpedia.org/resource/>

select ?c
       (count(distinct concat(str(?p1),str(?b),str(?p2))) as ?n_paths)
       (count(distinct ?inlink) as ?n_inlink)
where {
  dbpedia:AFC_Ajax ?p1 ?b . ?b ?p2 ?c .
  ?inlink ?p ?c
  filter strstarts(str(?c),str(dbpedia:))
}
group by ?c

SPARQL results

c                                                           n_paths n_inlink
----------------------------------------------------------------------------
http://dbpedia.org/resource/AFC_Ajax                        32      540
http://dbpedia.org/resource/Category:AFC_Ajax_players       17      484
http://dbpedia.org/resource/Category:Living_people          17      659447
http://dbpedia.org/resource/Category:Eredivisie_players     13      2232
http://dbpedia.org/resource/Category:Dutch_footballers      12      2141
http://dbpedia.org/resource/Category:1994_births             6      3605
…
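
To see why both counts in the query have to be distinct, here is a minimal Python sketch that mimics the three-pattern join on a handful of toy triples. The triples, the names join_rows and counts_per_c, and the node labels are all made up for illustration; nothing here talks to DBpedia. The tuple (p1, b, p2) plays the role of the concat(...) expression in the query.

```python
# Toy triples (subject, predicate, object). All names are invented for illustration.
TRIPLES = [
    ("A", "p", "B1"),
    ("A", "q", "B2"),
    ("B1", "r", "C"),
    ("B2", "r", "C"),
    ("X", "s", "C"),
    ("Y", "s", "C"),
    ("Z", "s", "C"),
]


def join_rows(start):
    """Emulate the join: start ?p1 ?b . ?b ?p2 ?c . ?inlink ?p3 ?c"""
    rows = []
    for s1, p1, b in TRIPLES:
        if s1 != start:
            continue
        for s2, p2, c in TRIPLES:
            if s2 != b:
                continue
            # Every path to c is paired with every triple pointing at c.
            for inlink, _p3, o3 in TRIPLES:
                if o3 == c:
                    rows.append((c, p1, b, p2, inlink))
    return rows


def counts_per_c(rows):
    """Group by c: (raw row count, distinct paths, distinct inlinks)."""
    grouped = {}
    for c, p1, b, p2, inlink in rows:
        g = grouped.setdefault(c, {"rows": 0, "paths": set(), "inlinks": set()})
        g["rows"] += 1
        g["paths"].add((p1, b, p2))   # one key per path, like concat(...)
        g["inlinks"].add(inlink)      # like count(distinct ?inlink)
    return {
        c: (g["rows"], len(g["paths"]), len(g["inlinks"]))
        for c, g in grouped.items()
    }


result = counts_per_c(join_rows("A"))
print(result)
```

In this toy graph there are 2 paths from A to C and 5 nodes linking to C, but the join produces 2 × 5 = 10 rows, so a plain count would report 10 for both columns. A tuple key also sidesteps the (mostly theoretical) risk that two different combinations concatenate to the same string.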
Joshua Taylor answered Mar 24 '26 11:03


