
Parsing SPARQL queries

I need to test a certain structural property across a couple of million SPARQL queries, and for that I need the structure of the WHERE clause. I'm currently trying to use fyzz for this, but unfortunately its documentation is not very helpful. Parsing the queries is easy; the problem is that I haven't been able to recover the structure of the clause. For example:

>>> from fyzz import parse
>>> a=parse("SELECT * WHERE {?x a ?y . {?x a ?z}}")
>>> b=parse("SELECT * WHERE {?x a ?y OPTIONAL {?x a ?z}}")
>>> a.where==b.where
True
>>> a.where
[(SparqlVar('x'), ('', 'a'), SparqlVar('y')), (SparqlVar('x'), ('', 'a'), SparqlVar('y'))]

Is there a way to recover the actual parse tree in fyzz instead of just the triples, or some other tool which would let me do this? RDFLib seems to have had a bison SPARQL parser in the past, but I can't find it in the rdflib or rdfextras.sparql packages.

Thanks

ailnlv asked Aug 08 '11 16:08


3 Answers

Another option is roqet, a command-line tool packaged with rasqal that prints the parsed tree. For instance:

roqet -i laqrs -d structure -n -e "SELECT * WHERE {?x a ?y OPTIONAL {?x a ?z}}"

would output:

Query:
query verb: SELECT
query bound variables (3): x, y, z
query Group graph pattern[0] {
  sub-graph patterns (2) {
    Basic graph pattern[1] #0 {
      triples {
        triple #0 { triple(variable(x), uri<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>, variable(y)) }
      }
    }
    Optional graph pattern[2] #1 {
      sub-graph patterns (1) {
        Basic graph pattern[3] #0 {
          triples {
            triple #0 { triple(variable(x), uri<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>, variable(z)) }
          }
        }
      }
    }
  }
}

Looking at your comment on the other answer, I don't think this is what you need, and I don't think you will find an answer by looking inside SPARQL parsers. The evaluation of objects (or triple patterns) in a query happens inside the query engine, which, in well-designed systems, is isolated from query parsing.

For instance, in 4store you could run the 4s-query command with the -vvv (very verbose) option; its output shows how the query was executed and how substitutions were performed for each triple pattern evaluation.
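
If the roqet structure dump does turn out to be enough for the structural test, its nesting can be recovered from the indentation. The following is a minimal Python sketch, not part of roqet or rasqal: it assumes the dump has exactly the layout shown above, and pattern_tree, PATTERN and DUMP are hypothetical names introduced only for illustration.

```python
import re

# Sample dump, copied from the roqet output shown above.
DUMP = """\
Query:
query verb: SELECT
query bound variables (3): x, y, z
query Group graph pattern[0] {
  sub-graph patterns (2) {
    Basic graph pattern[1] #0 {
      triples {
        triple #0 { triple(variable(x), uri<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>, variable(y)) }
      }
    }
    Optional graph pattern[2] #1 {
      sub-graph patterns (1) {
        Basic graph pattern[3] #0 {
          triples {
            triple #0 { triple(variable(x), uri<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>, variable(z)) }
          }
        }
      }
    }
  }
}
"""

# Matches lines such as "    Optional graph pattern[2] #1 {" and captures
# the leading indentation and the pattern kind (Group, Basic, Optional, ...).
PATTERN = re.compile(r"^(\s*)(?:query )?(\w+) graph pattern\[\d+\]")


def pattern_tree(dump):
    """Return the graph patterns in `dump` as nested (kind, children) tuples.

    Assumption: deeper indentation always means "child of the nearest
    less-indented graph pattern", as in the sample dump.
    """
    root = ("ROOT", [])
    stack = [(-1, root)]  # (indentation, node) pairs, innermost last
    for line in dump.splitlines():
        match = PATTERN.match(line)
        if match is None:
            continue  # triples, counters and closing braces are skipped
        indent = len(match.group(1))
        node = (match.group(2), [])
        # Drop patterns that are not ancestors of the current line.
        while stack[-1][0] >= indent:
            stack.pop()
        stack[-1][1][1].append(node)
        stack.append((indent, node))
    return root[1]


tree = pattern_tree(DUMP)
print(tree)
```

The nested tuples keep only the kind of each graph pattern, which may already be enough to tell a nested group apart from an OPTIONAL; the triples themselves would need an extra regular expression.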

Manuel Salvadores answered Nov 03 '22 19:11


ANTLR has a SPARQL grammar here: http://www.antlr.org/grammar/1200929755392/index.html

ANTLR can generate a parser from that grammar that targets Python.

Ned Batchelder answered Nov 03 '22 20:11


Try using rdflib.plugins.sparql.parser.parseQuery.

cnstntn.kndrtv answered Nov 03 '22 20:11