
How to query for people using Wikidata and SPARQL?

I'm new to SPARQL, and to Wikidata for that matter. I'm trying to let my users search Wikidata for people, and people only: I don't want any result to be a motorcycle brand or anything like that.

So I was playing around over here with the following query:

SELECT ?person ?personLabel WHERE {
  ?person wdt:P31 wd:Q5.
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en".
    ?person rdfs:label ?personLabel .
  }
  FILTER regex(?personLabel, "Albert", "i").
}
LIMIT 10

Though this eventually returns a result, it is hardly as fast as I'd like it to be. Note that it also just times out if you try the above query with a longer name.

All the example queries I've found (here) assume that you already have an entity to query from. In my case there is nothing to go on, since I'm trying to find someone with a certain name. I'm probably making some wrong assumptions about the inner workings of the database I'm working with, but I'm not sure what they are.

Any ideas?

Prowling Duck asked Sep 29 '16 15:09




1 Answer

The problem with free-text search on Wikidata is that it does not have a free-text index (yet). Without an index, a text search has to try a match against every label, which is not efficient. I could not come up with a query that searches for "Albert Einstein" and does not time out. An exact match (?person rdfs:label "Albert Einstein"@en .) does work, of course, but presumably that doesn't fit your needs. It would help if you could narrow down the selection of people in some other way first.
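For reference, the exact-match variant could look roughly like the sketch below. It assumes the query is run on query.wikidata.org, where the wd:, wdt:, rdfs:, wikibase: and bd: prefixes are predefined, and it reuses the wdt:P31 wd:Q5 restriction from the question so that only humans are returned. The label has to match character for character, including the @en language tag.

```sparql
# Sketch: exact label match, restricted to humans (instance of Q5).
# Assumes the prefixes predefined on query.wikidata.org.
SELECT ?person ?personLabel WHERE {
  # Exact lookup on the English label; no scan over all labels is needed.
  ?person rdfs:label "Albert Einstein"@en .
  # Keep only items that are instances of "human".
  ?person wdt:P31 wd:Q5 .
  # Let the label service fill in ?personLabel automatically.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 10
```

Because the label is bound to a constant, the engine can look it up directly instead of testing a regex against every person's label, which is why this form does not time out.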

DBpedia (http://dbpedia.org/sparql), on the other hand, has Virtuoso's bif:contains available, so this works quite fast there (http://yasgui.org/short/HJeZ4kjp):

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT * WHERE {
  ?sub a foaf:Person .
  ?sub rdfs:label ?lbl .
  ?lbl bif:contains "Albert AND Einstein" .
  filter(langMatches(lang(?lbl), "en"))
} 
LIMIT 10

evsheino answered Sep 19 '22 19:09
