SPARQL filter lang 'en' gives other languages

Tags: sparql, dbpedia

The SPARQL query below doesn't give me the results I want: some of the returned names are in languages other than English, even though I filter on lang 'en' (see the FILTER clauses in the query).

Sample results of the query:

"Никола́й Ива́нович Буха́рин"@en    "Никола́й Буха́рин"@en  "Nikolai Bukharin"@en
"Gamal Abdel Nasser Hussein"@en     "جمال عبد الناصر"@en    "Gamal Abdel Nasser"@en

I looked at the DBpedia pages and saw that English versions of the names exist, so I don't understand why the filter doesn't work.

Can someone help me with this?

PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbpedia: <http://dbpedia.org/property/>
SELECT DISTINCT ?person ?birthname ?nameExact ?label
where {

     ?person rdf:type dbpedia-owl:Person .
     ?person rdfs:label ?label .
     OPTIONAL { ?person dbpedia-owl:birthName ?birthname . }
     OPTIONAL { ?person dbpprop:name ?nameExact . }

     FILTER (lang(?birthname) = 'en')
     FILTER (lang(?label) = 'en')
     FILTER (lang(?nameExact) = 'en')

}
LIMIT 300
Funmatica asked Sep 11 '12 16:09


2 Answers

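Be careful with prefixes: the prefix you use in the query must be the one you declared. If you declare dbo:, write dbo:Person, not dbpedia-owl:Person; likewise, the question declares dbpedia: but uses dbpprop:.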

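PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbp: <http://dbpedia.org/property/>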

SELECT DISTINCT ?person ?birthname ?nameExact ?label
where {

     ?person rdf:type dbo:Person .
     ?person rdfs:label ?label .
     OPTIONAL { ?person dbo:birthName ?birthname . }
     OPTIONAL { ?person dbp:name ?nameExact . }

     FILTER (lang(?birthname) = 'en')
     FILTER (lang(?label) = 'en')
     FILTER (lang(?nameExact) = 'en')

}

LIMIT 300
user6312556 answered Nov 06 '22 13:11


The language tag is just an annotation stored in the database, and your filter works correctly. Some of the values in the database are tagged with en even though they are written in a different script or language. You'll need to write your own logic that selects the most appropriate property. I'd probably just use the rdfs:label property and cut off anything in parentheses (as in "Black Hawk (Sauk leader)"@en). That seems to give decent results.
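
A minimal sketch of the rdfs:label approach, assuming the endpoint supports the SPARQL 1.1 BIND and REPLACE features. The ?name variable is a name made up for this illustration, and the pattern only strips a single trailing parenthesised group:

```sparql
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbo:  <http://dbpedia.org/ontology/>

SELECT DISTINCT ?person ?name
WHERE {
  ?person rdf:type dbo:Person ;
          rdfs:label ?label .

  # Keep only labels tagged as English.
  FILTER (lang(?label) = 'en')

  # Drop a trailing disambiguator such as " (Sauk leader)".
  # STR() strips the language tag, so ?name is a plain string.
  # ?name is a hypothetical variable introduced for this sketch.
  BIND (REPLACE(STR(?label), " \\([^()]*\\)$", "") AS ?name)
}
LIMIT 300
```

With this pattern, a label like "Black Hawk (Sauk leader)"@en would come back as "Black Hawk", while labels without a trailing parenthesis are returned unchanged.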

Also note that you need to put the FILTERs for ?birthname and ?nameExact into the respective OPTIONAL block, otherwise they will end up removing any matches that didn't have the optional property.
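
Moving the filters could look roughly like the sketch below. It reuses the dbo: and dbp: prefixes from the other answer and spells out rdf: and rdfs:, which some endpoints predefine. When a FILTER sits outside the OPTIONAL, lang() is evaluated on an unbound variable for people without that property, the filter raises an error, and the whole row is dropped:

```sparql
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbo:  <http://dbpedia.org/ontology/>
PREFIX dbp:  <http://dbpedia.org/property/>

SELECT DISTINCT ?person ?birthname ?nameExact ?label
WHERE {
  ?person rdf:type dbo:Person ;
          rdfs:label ?label .

  # The label is required, so its filter stays at the top level.
  FILTER (lang(?label) = 'en')

  # Each optional value is filtered inside its own OPTIONAL block,
  # so a missing or non-English value leaves the variable unbound
  # instead of removing the person from the results.
  OPTIONAL {
    ?person dbo:birthName ?birthname .
    FILTER (lang(?birthname) = 'en')
  }
  OPTIONAL {
    ?person dbp:name ?nameExact .
    FILTER (lang(?nameExact) = 'en')
  }
}
LIMIT 300
```

This still returns values that are tagged en but written in another script, since the filter only looks at the tag.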

cygri answered Nov 06 '22 12:11