
Can this SPARQL query be simplified?

I'm currently implementing SPARQL queries for the LDBC benchmark and have come up with a solution to the bi-read-3 query. The relevant part of the data schema is shown in a diagram (image not available).

The query description:

Find the Tags that were used in Messages during the given month of the given year and the Tags that were used during the next month. For both months, compute the count of Messages that used each of the Tags.

My solution (with some syntax highlighting):

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX sn: <http://www.ldbc.eu/ldbc_socialnet/1.0/data/>
PREFIX snvoc: <http://www.ldbc.eu/ldbc_socialnet/1.0/vocabulary/>
PREFIX sntag: <http://www.ldbc.eu/ldbc_socialnet/1.0/tag/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dbpedia: <http://dbpedia.org/resource/>
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>


SELECT ?tagName (SUM(?countMonth1) as ?countMonth1) (SUM(?countMonth2) as ?countMonth2) (ABS( SUM(?countMonth1) - SUM(?countMonth2) ) as ?diff)
WHERE
{ 
  {
    SELECT ?tagName (COUNT(?message) as ?innerCountMonth1)
    WHERE {

      BIND ( 2010 as ?year1 ) .
      BIND ( 9 as ?month1 ) .

      {
        ?message rdf:type snvoc:Comment 
      } UNION {
        ?message rdf:type snvoc:Post 
      } .
      ?message snvoc:creationDate ?creationDate .
      FILTER ((year(?creationDate) = ?year1 && month(?creationDate) = ?month1) )

      ?message snvoc:hasTag ?tag .
      ?tag foaf:name ?tagName .

    }
    GROUP BY ?tagName
  } UNION {
    SELECT ?tagName (COUNT(?message) as ?innerCountMonth2)
    WHERE {

      BIND ( 2010 as ?year1 ) .
      BIND ( 9 as ?month1 ) .
      BIND ( ?year1 + FLOOR(?month1 / 12.0) as ?year2 ) .
      BIND ( IF (?month1 = 12, 1, ?month1 + 1) as ?month2 ) .
      {
        ?message rdf:type snvoc:Comment 
      } UNION {
        ?message rdf:type snvoc:Post 
      } .
      ?message snvoc:creationDate ?creationDate .
      FILTER (year(?creationDate) = ?year2 && month(?creationDate) = ?month2 ) 

      ?message snvoc:hasTag ?tag .
      ?tag foaf:name ?tagName .

    }
    GROUP BY  ?tagName
  }

  BIND ( COALESCE(?innerCountMonth1, 0) as ?countMonth1 )
  BIND ( COALESCE(?innerCountMonth2, 0) as ?countMonth2 )
}
GROUP BY ?tagName
ORDER BY DESC(?diff) ?tagName

I have a feeling that there is a simpler solution, but I can't figure it out.

My question: can this query be implemented in a simpler or more efficient way, e.g. without nested queries, or simply in a way that runs faster?

I'm really new to SPARQL, so I would appreciate any useful comment or improvement.

Asked Nov 07 '22 13:11 by János Benjamin Antal


1 Answer

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX snvoc: <http://www.ldbc.eu/ldbc_socialnet/1.0/vocabulary/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT
?tagName
(SUM(xsd:integer(?isFirst)) AS ?countMonth1)
(SUM(xsd:integer(?isSecond)) AS ?countMonth2)
(ABS(?countMonth1 - ?countMonth2) AS ?diff) 
WHERE
  {
  VALUES (?year1 ?month1) {(2010 9)}
  VALUES (?type) {(snvoc:Comment) (snvoc:Post)}
  ?message a ?type; snvoc:hasTag/foaf:name ?tagName; snvoc:creationDate ?creationDate .
  BIND (year(?creationDate) AS ?year) .
  BIND (month(?creationDate) AS ?month) .
  BIND (IF (?month1 = 12, ?year1 + 1, ?year1     ) AS ?year2) .
  BIND (IF (?month1 = 12,          1, ?month1 + 1) AS ?month2) .
  BIND (((?month1 = ?month) && (?year1 = ?year)) AS ?isFirst) .
  BIND (((?month2 = ?month) && (?year2 = ?year)) AS ?isSecond) .
  FILTER (?isFirst || ?isSecond)
  }
  GROUP BY ?tagName HAVING (bound(?tagName))
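
The query above replaces the two subqueries with a single pass: each message gets two boolean flags, and casting them with xsd:integer turns the per-tag sums into counts. As a possible further variation (a sketch only, not tested against an LDBC dataset), the year()/month() calls can be swapped for plain dateTime range comparisons, which some stores may be able to answer from an index. This assumes snvoc:creationDate is typed as xsd:dateTime and carries a time zone comparable to the literals below. The bounds are hard-coded for September and October 2010, and the names ?startMonth1, ?startMonth2 and ?endMonth2 are hypothetical.

```sparql
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX snvoc: <http://www.ldbc.eu/ldbc_socialnet/1.0/vocabulary/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?tagName
       (SUM(IF(?creationDate <  ?startMonth2, 1, 0)) AS ?countMonth1)
       (SUM(IF(?creationDate >= ?startMonth2, 1, 0)) AS ?countMonth2)
       (ABS(?countMonth1 - ?countMonth2) AS ?diff)
WHERE {
  # Assumed bounds: start of month 1, start of month 2, end of month 2 (exclusive).
  # The "Z" time zone is an assumption about how the data is stored.
  VALUES (?startMonth1 ?startMonth2 ?endMonth2) {
    ("2010-09-01T00:00:00Z"^^xsd:dateTime
     "2010-10-01T00:00:00Z"^^xsd:dateTime
     "2010-11-01T00:00:00Z"^^xsd:dateTime)
  }
  VALUES ?type { snvoc:Comment snvoc:Post }
  ?message a ?type ;
           snvoc:hasTag/foaf:name ?tagName ;
           snvoc:creationDate ?creationDate .
  # Keep only messages created in either of the two months.
  FILTER (?creationDate >= ?startMonth1 && ?creationDate < ?endMonth2)
}
GROUP BY ?tagName
ORDER BY DESC(?diff) ?tagName
```

Note that year() and month() read the fields as written in the literal, while the range comparison works on the timeline, so the two variants can disagree for messages stored with a non-UTC offset near a month boundary. The ORDER BY clause is carried over from the query in the question.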

Updates

See comments.

Answered Nov 19 '22 13:11 by Stanislav Kralin