
How to get property labels from Wikidata using SPARQL

I am using SPARQLWrapper to send SPARQL queries to Wikidata. At the moment I am trying to find all properties for an entity, e.g. with a simple triple pattern such as wd:Q11663 ?a ?b. This in itself works, but I am also trying to find human-readable labels for the returned properties and entities.

Although SERVICE wikibase:label works in Wikidata's query GUI, it does not seem to work with SPARQLWrapper, which insists on returning identical values for a variable and its 'label'.

Querying on the property rdfs:label works for the entity (?b), but this approach does not work with the property (?a).

It appears that the property is being returned as a full URI such as http://www.wikidata.org/prop/direct/P1536. Using the GUI I can successfully query wd:P1536 ?a ?b. This also works with SPARQLWrapper if I send it as a second query, but not within the first query.

Here is my code:

from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://query.wikidata.org/sparql")

sparql.setQuery("""
  SELECT ?a ?aLabel ?propLabel ?b ?bLabel
  WHERE
  {
    wd:Q11663 ?a ?b.

    # Doesn't work with SPARQLWrapper
    #SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    #?prop wikibase:directClaim ?p

    # but this does (and is more portable)
    ?b rdfs:label ?bLabel. filter(lang(?bLabel) = "en").

    # doesn't work
    #?a rdfs:label ?aLabel. 

    # property code can be extracted successfully
    BIND(  strafter(str(?a), "prop/direct/") AS ?propLabel).
    #BIND( CONCAT("wd:", strafter(str(?a), "prop/direct/") ) AS ?propLabel).

    # No matches, even if I concat 'wd:' to ?propLabel
    ?propLabel rdfs:label ?aLabel
    # generic search for any properties also fails
    #?propLabel ?zz ?aLabel.
   }
 """)

# However, this returns a label for P1536 - which is one of wd:Q11663's properties
sparql.setQuery("""SELECT ?b WHERE
   {
      wd:P1536 rdfs:label ?b.
   }
""")

So how can I get the labels for the properties in one query (which should be more efficient)?

[aside: yes I'm a bit rough & ready with the EN filter - often dropping it if I'm not getting anything back]

Asked by winwaed, Jun 07 '19 01:06



1 Answer

I was having problems with two approaches - and the code above contains a mixture of both. Also, SPARQLWrapper isn't a problem here.

The first approach, using the wikibase:label service, should look like this:

SELECT ?a ?aLabel ?propLabel ?b ?bLabel
WHERE
{
  ?item rdfs:label "weather"@en.
  ?item ?a ?b.

  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". } 
  ?prop wikibase:directClaim ?a .
}

This code also includes a lookup from the label ('weather') to the query entity (?item).
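For reference, here is a minimal sketch of sending that query through SPARQLWrapper and flattening the JSON result. The helper names (flatten, run) and the User-Agent string are made up for illustration, and I have assumed the https form of the endpoint URL used in the question.

```python
# Sketch only: sends the label-service query above through SPARQLWrapper.
QUERY = """
SELECT ?a ?aLabel ?propLabel ?b ?bLabel
WHERE
{
  ?item rdfs:label "weather"@en.
  ?item ?a ?b.

  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
  ?prop wikibase:directClaim ?a .
}
"""


def flatten(bindings):
    """Turn SPARQL JSON bindings into plain dicts of variable -> value."""
    return [{var: cell["value"] for var, cell in row.items()} for row in bindings]


def run(query=QUERY, endpoint="https://query.wikidata.org/sparql"):
    """Send the query to the endpoint and return the flattened rows."""
    # Imported here so flatten() can be tried without the library installed.
    from SPARQLWrapper import SPARQLWrapper, JSON

    # The agent string is a placeholder; replace it with your own.
    sparql = SPARQLWrapper(endpoint, agent="property-label-example/0.1")
    sparql.setQuery(query)
    sparql.setReturnFormat(JSON)
    results = sparql.query().convert()
    return flatten(results["results"]["bindings"])
```

Calling run() should give one dict per result row, keyed by variable name (a, aLabel, propLabel, b, bLabel). Variables that are unbound in a row are simply missing from that row's dict.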

The SERVICE was working all along, but if there is no rdfs:label definition then it just returns the entity itself in place of the label. The GUI and SPARQLWrapper (querying the SPARQL endpoint) were simply returning the results in a different order, so it looked as if I was seeing lots of 'failed' output (i.e. entities and their 'labels' being reported as the same value).
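One way to make this visible is to separate rows that have a real label from rows where the 'label' merely echoes the identifier, and to sort both groups so that GUI and endpoint output can be compared. This is a rough sketch: the function names are hypothetical, rows are assumed to be plain dicts of variable name to string value, and the check assumes the fallback is either the full URI or its trailing ID.

```python
def is_fallback_label(value, label):
    """True when the 'label' is just the URI or its trailing ID (no real label)."""
    local_id = value.rstrip("/").rsplit("/", 1)[-1]
    return label in (value, local_id)


def split_rows(rows, var="a"):
    """Separate rows with a real label for `var` from rows that only echo the ID.

    Both lists are sorted by the value of `var`, so the order in which the
    server happened to return the rows no longer matters.
    """
    labelled, fallback = [], []
    for row in rows:
        label = row.get(var + "Label")
        if label is None or is_fallback_label(row[var], label):
            fallback.append(row)
        else:
            labelled.append(row)

    def by_value(row):
        return row[var]

    return sorted(labelled, key=by_value), sorted(fallback, key=by_value)
```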

This became clear when I started adding an OPTIONAL clause to the approach below.

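The ?prop wikibase:directClaim ?a . line turns out to be pretty simple. Wikibase defines directClaim to link each property entity (e.g. wd:P1536) to the predicate used in direct claims (e.g. wdt:P1536, which is http://www.wikidata.org/prop/direct/P1536). This then allows it to define triples about the property itself (e.g. a label). Many other ontologies just use the same identifier for both.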
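To make the two namespaces concrete, the same mapping can be done on the client side after the results come back, e.g. if you want to look a property up in a follow-up query. This is only an illustration of what directClaim relates; the function name is made up, and it does not replace the in-query directClaim join.

```python
DIRECT_NS = "http://www.wikidata.org/prop/direct/"  # the wdt: prefix
ENTITY_NS = "http://www.wikidata.org/entity/"       # the wd: prefix


def direct_claim_to_entity(uri):
    """Map a wdt: predicate URI to the wd: URI of the property entity."""
    if not uri.startswith(DIRECT_NS):
        raise ValueError(f"not a direct-claim URI: {uri}")
    return ENTITY_NS + uri[len(DIRECT_NS):]
```

For example, direct_claim_to_entity("http://www.wikidata.org/prop/direct/P1536") gives "http://www.wikidata.org/entity/P1536", i.e. wd:P1536.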

My second (more generic) approach is the one you find in many of the books and online tutorials. The problem here is that Wikibase's property predicates come back as full URLs in the prop/direct namespace, and I needed to convert them into the corresponding property entities. I tried string manipulation, but this produces a string literal, not an entity. The solution is to use directClaim again:

?prop wikibase:directClaim ?a .
?prop rdfs:label ?propLabel.  filter(lang(?propLabel) = "en").

Note that this only returns a result if an rdfs:label is defined for the property in the requested language. Wrapping the label lookup in an OPTIONAL block will return results even if there is no such label.
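Putting the second approach together with OPTIONAL, a query builder might look like the sketch below. The function name and its validation rules are my own additions for illustration; the query relies on the prefixes (wd:, wikibase:, rdfs:) that the Wikidata endpoint predefines.

```python
import re


def build_property_query(entity_id, lang="en"):
    """Build a query listing an entity's direct claims with optional labels."""
    # Basic validation so arbitrary text is not pasted into the query.
    if not re.fullmatch(r"Q[1-9]\d*", entity_id):
        raise ValueError(f"not a Wikidata item ID: {entity_id!r}")
    if not re.fullmatch(r"[a-z]{2,3}(-[a-z0-9]+)*", lang):
        raise ValueError(f"unexpected language code: {lang!r}")

    # Doubled braces are literal braces inside an f-string.
    return f"""
SELECT ?a ?propLabel ?b ?bLabel
WHERE
{{
  wd:{entity_id} ?a ?b .
  ?prop wikibase:directClaim ?a .
  OPTIONAL {{ ?prop rdfs:label ?propLabel . FILTER(LANG(?propLabel) = "{lang}") }}
  OPTIONAL {{ ?b rdfs:label ?bLabel . FILTER(LANG(?bLabel) = "{lang}") }}
}}
"""
```

The returned string can be passed to sparql.setQuery() as in the question, e.g. with build_property_query("Q11663"). Rows without a label in the chosen language are still returned, with ?propLabel or ?bLabel left unbound.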

Answered by winwaed, Sep 29 '22 21:09