
SPARQL query on a remote endpoint with RDFLib / Redland

I'm trying to query remote endpoints to get owl:sameAs mappings. I've tried both RDFLib and Redland, but neither worked for me; I'm probably not dealing with namespaces correctly.

Here is my attempt in RDFLib:

    import rdflib

    rdflib.plugin.register('sparql', rdflib.query.Processor, 'rdfextras.sparql.processor', 'Processor')
    rdflib.plugin.register('sparql', rdflib.query.Result, 'rdfextras.sparql.query', 'SPARQLQueryResult')

    g = rdflib.Graph()

    query = """
        SELECT *
        FROM <http://api.talis.com/stores/bbc-backstage/services/sparql>
        WHERE {
             ?s a http://purl.org/ontology/mo/MusicArtist;
                http://www.w3.org/2002/07/owl#sameAs ?o .
        }Limit 50
    """

    for row in g.query(query):
        print row

And here is Redland:

    import RDF

    model = RDF.Model()

    query = """
        SELECT *
        FROM <http://api.talis.com/stores/bbc-backstage/services/sparql>
        WHERE {
             ?s a http://purl.org/ontology/mo/MusicArtist;
                http://www.w3.org/2002/07/owl#sameAs ?o .
        }Limit 50
    """

    for statement in RDF.Query(query, query_language="sparql").execute(model):
        print statement

Can you please give a hint about what is wrong in either of those? Another difficulty I have: is it possible to get the dataset name of the object? For example, given:

?s = http://www.bbc.co.uk/music/artists/eb5c8564-927d-414d-b152-c7b48a2c9d8b#artist
predicate = http://www.w3.org/2002/07/owl#sameAs
?o = http://dbpedia.org/resource/The_Boy_Least_Likely_To

can I get the name "DBpedia" in this example, or the name of any other dataset that I have a sameAs link to? (Or perhaps I could just look for the dataset names I'm interested in within the object string.) Thank you very much in advance.

user52028778 asked May 04 '11 18:05



4 Answers

Various things:

You are right that the URIs are the problem: in SPARQL, every full URI must be enclosed in angle brackets (< >). The corrected query is:

SELECT ?s ?o WHERE {
    ?s a <http://purl.org/ontology/mo/MusicArtist> ;
       <http://www.w3.org/2002/07/owl#sameAs> ?o .
} LIMIT 50

... see the results here.
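Since the question mentions namespaces, an alternative to writing full URIs in angle brackets is to declare prefixes. The following is a minimal Python 3 sketch that only builds the query string; the `PREFIXES` table and the `prefix_header` helper are hypothetical names introduced for illustration, and the two namespaces are the ones already used in the query above.

```python
# Prefix table (hypothetical name); both namespaces come from the query above.
PREFIXES = {
    "mo": "http://purl.org/ontology/mo/",
    "owl": "http://www.w3.org/2002/07/owl#",
}


def prefix_header(prefixes):
    """Render one PREFIX declaration per namespace, sorted by prefix name."""
    return "\n".join(
        "PREFIX {}: <{}>".format(name, iri)
        for name, iri in sorted(prefixes.items())
    )


# Prefixed names such as mo:MusicArtist are written without angle brackets.
QUERY = prefix_header(PREFIXES) + """
SELECT ?s ?o WHERE {
    ?s a mo:MusicArtist ;
       owl:sameAs ?o .
} LIMIT 50
"""

print(QUERY)
```

The resulting string should be equivalent to the corrected query: `mo:MusicArtist` expands to `<http://purl.org/ontology/mo/MusicArtist>` and `owl:sameAs` to `<http://www.w3.org/2002/07/owl#sameAs>`.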

FROM does not do what you expect in rdflib or Redland. It does not send the query to a remote SPARQL endpoint; it names a graph to fetch remotely, or a graph with that name in a local store. What you want here is SERVICE (see how it works here with Jena). Unfortunately, neither rdflib nor Redland implements the SERVICE clause for SPARQL, but there are workarounds.

One possible solution is to use SPARQLWrapper for Python. It is straightforward; here is your code rewritten with that library:

from SPARQLWrapper import SPARQLWrapper, JSON

# The endpoint URL is passed to the wrapper, not written into the query.
sparql = SPARQLWrapper("http://api.talis.com/stores/bbc-backstage/services/sparql")
sparql.setQuery("""
    SELECT ?s ?o
    WHERE {
         ?s a <http://purl.org/ontology/mo/MusicArtist>;
            <http://www.w3.org/2002/07/owl#sameAs> ?o .
    } limit 50
""")
# Ask for SPARQL JSON results and parse them into a Python dict.
sparql.setReturnFormat(JSON)
results = sparql.query().convert()

# Each binding maps a variable name to a dict holding its value.
for result in results["results"]["bindings"]:
    print(result["s"]["value"], result["o"]["value"])

As you can see the remote SPARQL endpoint becomes a parameter outside the query.
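On the second part of the question (finding which dataset an owl:sameAs target belongs to), one rough approach is the one the asker suggests: inspect the object URI. The sketch below, in Python 3, treats the URI's host as the dataset name, which is an assumption rather than anything the endpoint guarantees. `dataset_of` and `group_by_dataset` are hypothetical helpers, and `sample_bindings` is hand-written data in the SPARQL JSON results shape, built from the example in the question, so the code runs without a network connection.

```python
from collections import defaultdict
from urllib.parse import urlparse


def dataset_of(uri):
    """Return the host part of a URI, used here as a rough dataset name."""
    return urlparse(uri).netloc


def group_by_dataset(bindings):
    """Map each target host to the (subject, object) pairs that link to it."""
    groups = defaultdict(list)
    for row in bindings:
        s, o = row["s"]["value"], row["o"]["value"]
        groups[dataset_of(o)].append((s, o))
    return dict(groups)


# Hand-written sample mirroring the question's example (not live endpoint output).
sample_bindings = [
    {
        "s": {"type": "uri", "value": "http://www.bbc.co.uk/music/artists/"
                                      "eb5c8564-927d-414d-b152-c7b48a2c9d8b#artist"},
        "o": {"type": "uri", "value": "http://dbpedia.org/resource/The_Boy_Least_Likely_To"},
    },
]

for host, links in group_by_dataset(sample_bindings).items():
    print(host, len(links))
```

With real results, `results["results"]["bindings"]` from the SPARQLWrapper code could be passed in place of `sample_bindings`. Mapping a host such as `dbpedia.org` to a display name like "DBpedia" would still need a lookup table of your own.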

Manuel Salvadores answered Oct 05 '22 13:10


Redland does not currently support using SPARQL endpoints in FROM. What you are using there are graph names that you load into the RDF dataset. In Redland these are known as contexts: you attach one when you load a triple (s, p, o) together with a context c, using something like model.context_add_statement(statement, context).

The Git version of Rasqal does support parsing SERVICE, but it does not yet execute it in a query.
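The graph-name behaviour described above can be illustrated with a toy model. This is a conceptual Python 3 sketch only, not Redland's API: `TinyDataset` and its methods are hypothetical, and the `http://example.org/...` URIs are placeholders. It covers the local-store case, where FROM selects triples by the graph name they were loaded under.

```python
from collections import defaultdict


class TinyDataset:
    """Toy stand-in for an RDF dataset: triples grouped by graph name (context)."""

    def __init__(self):
        self._graphs = defaultdict(set)

    def add(self, triple, context):
        """Store an (s, p, o) triple under the given graph name."""
        self._graphs[context].add(triple)

    def from_graph(self, name):
        """Mimic FROM <name>: return only triples already loaded under that name."""
        return set(self._graphs.get(name, set()))


ds = TinyDataset()
ds.add(
    ("http://example.org/a", "http://www.w3.org/2002/07/owl#sameAs", "http://example.org/b"),
    "http://example.org/graph1",
)

# The endpoint URL was never loaded as a graph, so selecting it yields nothing.
endpoint = "http://api.talis.com/stores/bbc-backstage/services/sparql"
print(len(ds.from_graph("http://example.org/graph1")))
print(len(ds.from_graph(endpoint)))
```

In this model the endpoint URL is just another graph name, which is why putting it in FROM does not cause the query to be sent to the endpoint.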

dajobe answered Oct 05 '22 13:10


You could also consider using Virtuoso with Redland, as it implements the SPARQL-FED SERVICE clause for remote query execution, as demonstrated in these online examples.
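For reference, a federated query wraps the graph pattern in a SERVICE clause that names the remote endpoint. The Python 3 sketch below only builds such a query string; `federated_query` is a hypothetical helper, and the string would still have to be sent to an engine that actually executes SERVICE, which is assumed here rather than shown.

```python
def federated_query(endpoint, pattern, limit=50):
    """Wrap a graph pattern in a SERVICE clause aimed at a remote endpoint."""
    return (
        "SELECT ?s ?o WHERE {\n"
        "    SERVICE <" + endpoint + "> {\n"
        "        " + pattern + "\n"
        "    }\n"
        "} LIMIT " + str(limit)
    )


# Endpoint and pattern are taken from the question.
ENDPOINT = "http://api.talis.com/stores/bbc-backstage/services/sparql"
PATTERN = (
    "?s a <http://purl.org/ontology/mo/MusicArtist> ; "
    "<http://www.w3.org/2002/07/owl#sameAs> ?o ."
)

print(federated_query(ENDPOINT, PATTERN))
```

Unlike FROM, the URI inside SERVICE is treated as an endpoint to send the inner pattern to, not as the name of a graph.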

hwilliams answered Oct 05 '22 15:10


There's another simple solution in the blog entry at http://terse-words.blogspot.com/2012/01/get-real-data-from-semantic-web.html which keeps the code fairly clean. It uses SPARQLWrapper as well.

Col Wilson answered Oct 05 '22 15:10
