
Multiple aggregates in SPARQL

I have a triple store that contains mail archive data. So let's say I have a lot of persons (foaf:Person) that have sent (ex:hasSent) and received (ex:hasReceived) emails (ex:Email).

Example:

SELECT ?person ?email
WHERE {
    ?email  rdf:type   ex:Email.
    ?person rdf:type   foaf:Person;
            ex:hasSent ?email.
}

The same works for ex:hasReceived, of course. Now I would like to do some statistics and analytics, i.e. determine how many emails an individual has sent and received. Doing this for only one predicate is a simple aggregation:

SELECT ?person (COUNT(?email) AS ?count)
WHERE {
    ?email  rdf:type   ex:Email.
    ?person rdf:type   foaf:Person;
            ex:hasSent ?email.
}
GROUP BY ?person

However, I need the number of received emails as well, and I would like to get it without issuing a separate query. So I tried the following:

SELECT ?person (COUNT(?email1) AS ?sent_emails) (COUNT(?email2) AS ?received_emails)
WHERE {
  ?person rdf:type foaf:Person.

  ?sent_email rdf:type ex:Email.
  ?person ex:hasSent ?sent_email.

  ?received_email rdf:type ex:Email.
  ?person ex:hasReceived ?received_email.
}
GROUP BY ?person

This did not seem to be right, as the numbers for the emails sent vs. received were exactly the same. I assume this is because my SPARQL statement results in a cross product of all mails a person has ever sent and received, right?

What do I need to do in order to get the statistics right on a per-individual basis?

cyroxx asked Mar 04 '26 18:03



1 Answer

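COUNT(?email1) isn't counting anything because ?email1 is never bound: the WHERE clause uses ?sent_email and ?received_email, so the variables in the SELECT clause don't match. Also, as you mention, there is a partial cross product: within each person's group, every sent email is paired with every received email. DISTINCT will help.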

Try (COUNT(DISTINCT ?sent_email) AS ?sent_emails), and likewise (COUNT(DISTINCT ?received_email) AS ?received_emails).
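
Put together, the whole query might look something like the sketch below. The PREFIX declarations are assumptions (the ex: namespace in particular is a placeholder, since the question doesn't show it), and the OPTIONAL blocks are an addition of mine rather than part of the fix itself: without them, anyone who has only sent or only received mail would drop out of the results entirely.

```sparql
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
# Placeholder namespace (assumption): replace with the one your store uses.
PREFIX ex:   <http://example.org/>

SELECT ?person
       (COUNT(DISTINCT ?sent_email)     AS ?sent_emails)
       (COUNT(DISTINCT ?received_email) AS ?received_emails)
WHERE {
  ?person rdf:type foaf:Person .

  # OPTIONAL keeps people who have sent but never received mail (and vice versa).
  OPTIONAL {
    ?person     ex:hasSent ?sent_email .
    ?sent_email rdf:type   ex:Email .
  }
  OPTIONAL {
    ?person         ex:hasReceived ?received_email .
    ?received_email rdf:type       ex:Email .
  }
}
GROUP BY ?person
```

The sent/received pairs are still multiplied out within each group, but DISTINCT collapses the repeats, so each email is counted once. When one of the OPTIONAL blocks finds nothing, its variable stays unbound and COUNT reports 0 for that column.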

AndyS answered Mar 08 '26 22:03
