I am currently implementing SPARQL queries for the LDBC benchmark, and I came up with a solution to the bi-read-3 query. The relevant part of the data schema is the following:
The query description:
Find the Tags that were used in Messages during the given month of the given year and the Tags that were used during the next month. For both months, compute the count of Messages that used each of the Tags.
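One detail worth pinning down in this description is what "the next month" means for December: it rolls over to January of the following year. A minimal sketch that checks this rollover in isolation (the two input rows are illustrative assumptions, and no data is needed to run it):

```sparql
SELECT ?year1 ?month1 ?year2 ?month2
WHERE {
  # Illustrative inputs: an ordinary month and the December edge case.
  VALUES (?year1 ?month1) { (2010 9) (2010 12) }
  # The year only advances when the given month is December.
  BIND (IF(?month1 = 12, ?year1 + 1, ?year1) AS ?year2)
  # December wraps to 1; every other month simply increments.
  BIND (IF(?month1 = 12, 1, ?month1 + 1) AS ?month2)
}
```

This should yield 2010/10 for the first row and 2011/1 for the second.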
My solution (with some syntax highlighting):
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX sn: <http://www.ldbc.eu/ldbc_socialnet/1.0/data/>
PREFIX snvoc: <http://www.ldbc.eu/ldbc_socialnet/1.0/vocabulary/>
PREFIX sntag: <http://www.ldbc.eu/ldbc_socialnet/1.0/tag/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dbpedia: <http://dbpedia.org/resource/>
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
SELECT ?tagName (SUM(?countMonth1) as ?countMonth1) (SUM(?countMonth2) as ?countMonth2) (ABS( SUM(?countMonth1) - SUM(?countMonth2) ) as ?diff)
WHERE
{
  {
    SELECT ?tagName (COUNT(?message) as ?innerCountMonth1)
    WHERE {
      BIND ( 2010 as ?year1 ) .
      BIND ( 9 as ?month1 ) .
      {
        ?message rdf:type snvoc:Comment
      } UNION {
        ?message rdf:type snvoc:Post
      } .
      ?message snvoc:creationDate ?creationDate .
      FILTER ((year(?creationDate) = ?year1 && month(?creationDate) = ?month1) )
      ?message snvoc:hasTag ?tag .
      ?tag foaf:name ?tagName .
    }
    GROUP BY ?tagName
  } UNION {
    SELECT ?tagName (COUNT(?message) as ?innerCountMonth2)
    WHERE {
      BIND ( 2010 as ?year1 ) .
      BIND ( 9 as ?month1 ) .
      BIND ( ?year1 + FLOOR(?month1 / 12.0) as ?year2 ) .
      BIND ( IF (?month1 = 12, 1, ?month1 + 1) as ?month2 ) .
      {
        ?message rdf:type snvoc:Comment
      } UNION {
        ?message rdf:type snvoc:Post
      } .
      ?message snvoc:creationDate ?creationDate .
      FILTER (year(?creationDate) = ?year2 && month(?creationDate) = ?month2 )
      ?message snvoc:hasTag ?tag .
      ?tag foaf:name ?tagName .
    }
    GROUP BY ?tagName
  }
  BIND ( COALESCE(?innerCountMonth1, 0) as ?countMonth1 )
  BIND ( COALESCE(?innerCountMonth2, 0) as ?countMonth2 )
}
GROUP BY ?tagName
ORDER BY DESC(?diff) ?tagName
I have a feeling that there is a simpler solution, but I can't figure it out.
My question: can this query be implemented in a simpler or more efficient way, e.g. without nested queries, or just so that it runs faster?
I'm really new to SPARQL, so I would appreciate any useful comment or improvement.
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX snvoc: <http://www.ldbc.eu/ldbc_socialnet/1.0/vocabulary/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT
  ?tagName
  (SUM(xsd:integer(?isFirst)) AS ?countMonth1)
  (SUM(xsd:integer(?isSecond)) AS ?countMonth2)
  (ABS(?countMonth1 - ?countMonth2) AS ?diff)
WHERE
{
  VALUES (?year1 ?month1) {(2010 9)}
  VALUES (?type) {(snvoc:Comment) (snvoc:Post)}
  ?message a ?type; snvoc:hasTag/foaf:name ?tagName; snvoc:creationDate ?creationDate .
  BIND (year(?creationDate) AS ?year) .
  BIND (month(?creationDate) AS ?month) .
  BIND (IF (?month1 = 12, ?year1 + 1, ?year1 ) AS ?year2) .
  BIND (IF (?month1 = 12, 1, ?month1 + 1) AS ?month2) .
  BIND (((?month1 = ?month) && (?year1 = ?year)) AS ?isFirst) .
  BIND (((?month2 = ?month) && (?year2 = ?year)) AS ?isSecond) .
  FILTER (?isFirst || ?isSecond)
}
GROUP BY ?tagName HAVING (bound(?tagName))
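A possible variation, sketched here under assumptions rather than tested against the benchmark data: if snvoc:creationDate values are xsd:dateTime literals, the two months can be selected with a single range comparison instead of calling year() and month() on every message. Some engines may be able to use an index for such a range filter, though whether that actually helps depends on the store. The three boundary timestamps and their UTC time zone are assumptions and would need to match the data. The ORDER BY clause from the question is also restored here.

```sparql
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX snvoc: <http://www.ldbc.eu/ldbc_socialnet/1.0/vocabulary/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT
  ?tagName
  # Messages before the second month's start fall into month 1, the rest into month 2.
  (SUM(IF(?creationDate <  ?month2Start, 1, 0)) AS ?countMonth1)
  (SUM(IF(?creationDate >= ?month2Start, 1, 0)) AS ?countMonth2)
  (ABS(?countMonth1 - ?countMonth2) AS ?diff)
WHERE
{
  # Assumed boundaries for 2010-09 and 2010-10 (UTC); adjust to the data's time zone.
  VALUES (?month1Start ?month2Start ?month2End) {
    ("2010-09-01T00:00:00Z"^^xsd:dateTime
     "2010-10-01T00:00:00Z"^^xsd:dateTime
     "2010-11-01T00:00:00Z"^^xsd:dateTime)
  }
  VALUES (?type) {(snvoc:Comment) (snvoc:Post)}
  ?message a ?type ;
           snvoc:hasTag/foaf:name ?tagName ;
           snvoc:creationDate ?creationDate .
  # Keep only messages inside the two-month window.
  FILTER (?creationDate >= ?month1Start && ?creationDate < ?month2End)
}
GROUP BY ?tagName
ORDER BY DESC(?diff) ?tagName
```

Note that year() and month() read the components as written in each literal, while the range comparison works on the absolute timeline, so the two approaches can disagree for timestamps near a month boundary if the data mixes time zones.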
Updates
See comments.