
How to run IN and NOT IN SPARQL statements in python rdflib to remove the intersection of two graphs

I'm trying to use the IN and NOT IN operators (which, if I understand correctly, were introduced in SPARQL 1.1) with the Python implementation of SPARQL (now in rdfextras), but it seems that the syntax is not recognised.

Let's consider two sets (A and B). I want to output what is in Set A, removing what is in Set B.

SELECT ?title WHERE {
   some logic defining ?item and ?SetB
   FILTER (?item NOT IN ?SetB)
}

Maybe this operator was added in SPARQL 1.1 and is not supported by rdfextras, in which case I would love to have a workaround (or a way to do it without the NOT IN keyword).

Alexis Métaireau asked Apr 25 '11 15:04


2 Answers

I have tried a similar query and also got a parsing exception. I have gone through rdflib's SPARQL parser code and there doesn't seem to be a rule that handles IN or NOT IN, so I assume this functionality is not implemented.

Anyway, I am not sure you are using it correctly. The NOT IN definition in the SPARQL 1.1 spec applies the operator to a list of expressions. Therefore, you'd write:

FILTER (?item NOT IN (?SetB))

I am also not completely sure whether you can use variables on the right-hand side, because all the examples in the spec use terms. Edit: see RobV's message; it is possible to use variables on the right-hand side.

workaround with one query

One possible solution that might work for you is to use OPTIONAL and bound (both supported in rdflib). Something like:

SELECT ?title WHERE {
   some logic defining ?item
   OPTIONAL {
      some logic defining ?SetB
   }
   FILTER (bound(?SetB) && ?SetB != ?item)
}

Without knowing more about your query, I cannot give better advice for this case.

workaround with two queries

The easiest way to solve this with rdflib is to use filters and two queries. The first query retrieves all the possible values for ?SetB, and in the second query you dynamically create a filter that excludes each of them:

SELECT ?title WHERE {
   some logic defining ?item
   FILTER (?item != <setb_val1> && ?item != <setb_val2> &&
   ... && ?item != <setb_valN>)
}
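
The dynamic filter for the second query can be assembled in plain Python before the query string is passed to rdflib. This is a minimal sketch, assuming the first query returned plain IRIs; the helper names build_exclusion_filter and build_second_query and the example.org IRIs are hypothetical, and the values are inserted without escaping, so they should come from a trusted source.

```python
def build_exclusion_filter(variable, excluded_iris):
    """Return a SPARQL FILTER that rejects every IRI in excluded_iris."""
    if not excluded_iris:
        return ""  # nothing to exclude, so no FILTER is needed
    conditions = [f"?{variable} != <{iri}>" for iri in excluded_iris]
    return "FILTER (" + " && ".join(conditions) + ")"


def build_second_query(item_pattern, excluded_iris):
    """Assemble the second query around a caller-supplied pattern for ?item."""
    filter_clause = build_exclusion_filter("item", excluded_iris)
    return (
        "SELECT ?title WHERE {\n"
        f"   {item_pattern}\n"
        f"   {filter_clause}\n"
        "}"
    )


# Hypothetical values standing in for the results of the first query.
set_b = ["http://example.org/b1", "http://example.org/b2"]
query = build_second_query("?item <http://example.org/title> ?title .", set_b)
print(query)
```

The resulting string can then be run like any other query. With a large Set B the filter grows linearly, so this approach is probably best kept for small sets.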

Manuel Salvadores answered Sep 19 '22 22:09


Difficult to answer without specifics, but it sounds like you want MINUS:

SELECT ?title WHERE {
    ?item ... ITEM CRITERIA ...
    MINUS { ?item ... SET CRITERIA ... }
}

for example:

SELECT ?title WHERE {
    ?item ex:colour "red" .       # item is red
    MINUS { ?item ex:size "big" } # but not in set of big things
}
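
To make the effect of MINUS concrete, here is a rough pure-Python model of the rule: a solution on the left is dropped when some solution on the right shares at least one variable with it and agrees on every shared variable. This is only a sketch of the semantics, not how rdflib or any other engine implements it, and the item names are made up for illustration.

```python
def compatible(left, right):
    """Two solutions are compatible if they agree on every shared variable."""
    shared = left.keys() & right.keys()
    return all(left[var] == right[var] for var in shared)


def minus(left_solutions, right_solutions):
    """Keep each left solution unless a right solution shares a variable
    with it and is compatible with it."""
    kept = []
    for left in left_solutions:
        excluded = any(
            (left.keys() & right.keys()) and compatible(left, right)
            for right in right_solutions
        )
        if not excluded:
            kept.append(left)
    return kept


# Hypothetical solutions for `?item ex:colour "red"` ...
red_items = [{"item": "ex:apple"}, {"item": "ex:firetruck"}, {"item": "ex:cherry"}]
# ... and for `?item ex:size "big"`.
big_items = [{"item": "ex:firetruck"}, {"item": "ex:whale"}]

result = minus(red_items, big_items)
print(result)  # [{'item': 'ex:apple'}, {'item': 'ex:cherry'}]
```

Note that a right-hand solution with no variable in common with the left removes nothing, which is why the MINUS pattern needs to mention ?item.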

NOT IN is a bit misleading: as far as I can tell, it operates over a list of expressions written inline in the FILTER, not over a list you can define elsewhere in the query.

user205512 answered Sep 21 '22 22:09