
Get list of used values for some property

Tags:

wikidata

Can I get list of used values for some property? For example, I would like to get a list of all used distinct values of P166 (award received) property.

Alexander Sigachov asked Dec 11 '14 12:12


2 Answers

UPDATE: this is now a trivial operation thanks to the Wikidata Query Service, to which you can send SPARQL requests:

SELECT DISTINCT ?award WHERE {
  ?awarded_item wdt:P166 ?award .
}
  • try it in the GUI
  • get the results as JSON
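
For scripted use, here is a minimal Python sketch of the same request. It assumes the public Wikidata Query Service endpoint at https://query.wikidata.org/sparql with its `query` and `format` parameters, and a response in the standard SPARQL JSON results format. The helper names and the sample response are made up for illustration, and the HTTP call itself is left to whichever client you prefer.

```python
from urllib.parse import urlencode

# Public Wikidata Query Service endpoint (assumed here).
ENDPOINT = "https://query.wikidata.org/sparql"

QUERY = """
SELECT DISTINCT ?award WHERE {
  ?awarded_item wdt:P166 ?award .
}
"""


def build_query_url(query, endpoint=ENDPOINT):
    """Return a GET URL that asks the endpoint for JSON results."""
    return endpoint + "?" + urlencode({"query": query, "format": "json"})


def extract_award_ids(response):
    """Pull the item IDs out of a parsed SPARQL JSON response.

    Each binding holds a URI such as http://www.wikidata.org/entity/Q17144;
    only the part after the last slash is kept.
    """
    ids = []
    for binding in response["results"]["bindings"]:
        uri = binding["award"]["value"]
        ids.append(uri.rsplit("/", 1)[-1])
    return ids


# Hand-written sample in the SPARQL JSON results shape, not real output.
SAMPLE_RESPONSE = {
    "head": {"vars": ["award"]},
    "results": {
        "bindings": [
            {"award": {"type": "uri", "value": "http://www.wikidata.org/entity/Q17144"}},
            {"award": {"type": "uri", "value": "http://www.wikidata.org/entity/Q35637"}},
        ]
    },
}

if __name__ == "__main__":
    print(build_query_url(QUERY))
    print(extract_award_ids(SAMPLE_RESPONSE))
```

Fetch the URL from `build_query_url` with any HTTP client, parse the body as JSON, and pass the result to `extract_award_ids`.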

OUTDATED:

A (brutal?) way to approach this would be to look for all the items that have this property, using the all-mighty wmflabs WDQ tool:

http://wdq.wmflabs.org/api?q=claim[166]

This returns 158 846 entity ids. You could use those to build a first impression of what values this property takes, using the official Wikidata API (50 entities at a time maximum):

https://www.wikidata.org/w/api.php?action=wbgetentities&props=claims&format=json&ids=Q23|Q24|Q32|Q76|Q80|Q90|Q95|Q157|Q181|Q206|Q254|Q272|Q306|Q320|Q326|Q329|Q331|Q335|Q352|Q377|Q392|Q400|Q410|Q440|Q444|Q458|Q489|Q498|Q512|Q517|Q529|Q530|Q557|Q567|Q576|Q579|Q600|Q615|Q632|Q633|Q636|Q648|Q651|Q680|Q714|Q755|Q765|Q855|Q862|Q873|Q882

This would return JSON containing those entities' claims. You just have to do some (rather ugly) parsing (here in CoffeeScript, hope it's ok for you) to find what you are looking for:

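awardIds = []
for id, entity of wikidataResponse.entities
  # Tolerate entities that come back without any P166 claim
  for claim in (entity.claims?.P166 or [])
    numericId = claim.mainsnak?.datavalue?.value?['numeric-id']
    # Claims without a concrete item value yield undefined: skip them
    awardIds.push numericId if numericId?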

And (a little cleaning later) voila! The 122 values taken by the 50 first entities having the property P166:

[ 'Q17144', 'Q28003', 'Q31323', 'Q35637', 'Q37922', 'Q84020', 'Q93488', 'Q93716', 'Q93728', 'Q94121', 'Q103618', 'Q106301', 'Q120649', 'Q136733', 'Q145752', 'Q152337', 'Q154554', 'Q163700', 'Q178473', 'Q185493', 'Q208167', 'Q209896', 'Q218551', 'Q233454', 'Q278798', 'Q337463', 'Q465316', 'Q465774', 'Q541985', 'Q611968', 'Q680248', 'Q684511', 'Q697762', 'Q700899', 'Q721743', 'Q724443', 'Q758861', 'Q768999', 'Q805316', 'Q852071', 'Q858637', 'Q873842', 'Q896312', 'Q908745', 'Q908858', 'Q931502', 'Q963068', 'Q969644', 'Q976544', 'Q1059569', 'Q1063447', 'Q1081449', 'Q1123431', 'Q1139419', 'Q1141149', 'Q1316544', 'Q1357178', 'Q1364116', 'Q1415232', 'Q1442352', 'Q1465304', 'Q1543268', 'Q1599870', 'Q1789030', 'Q1818440', 'Q1818451', 'Q1853663', 'Q1969175', 'Q1991972', 'Q2325638', 'Q2329480', 'Q2465245', 'Q2536791', 'Q2547676', 'Q2727598', 'Q2990283', 'Q3295156', 'Q3324507', 'Q3403391', 'Q3405483', 'Q3519573', 'Q4273323', 'Q5593890', 'Q7241175', 'Q9052807', 'Q10855195', 'Q10855226', 'Q10855271', 'Q10905105', 'Q11599352', 'Q11609173', 'Q12177451', 'Q12201445', 'Q12201477', 'Q12270554', 'Q12981673', 'Q13422143', 'Q13452531', 'Q13554470', 'Q14539974', 'Q14539990', 'Q15117228', 'Q15229170', 'Q15278116', 'Q15631401', 'Q15710140', 'Q15831432', 'Q16141095', 'Q17099726', 'Q17200714', 'Q17355204', 'Q17373936' ]

That's already a good sample, but there is an important bias: the entities sampled here are among the very first ones added to Wikidata (from Q23 to Q882), so you will probably get a rather historically old and Western-centric set of possible P166 values. You might want to repeat this sampling with other parts of the 158 846 entities set (if not all of it).
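
One way to spread the sample over the whole set is to pick ids at random before batching them. The Python sketch below is only an illustration under a few assumptions: the id list is a list of numeric ids (23 for Q23), the 50-entity limit is the one quoted above, and the helper names are hypothetical. It builds the wbgetentities URLs and collects the P166 values from a parsed response, leaving the HTTP calls to you.

```python
import random
from urllib.parse import urlencode

API = "https://www.wikidata.org/w/api.php"
BATCH_SIZE = 50  # per-request limit quoted in the answer


def sample_ids(numeric_ids, k, seed=None):
    """Pick up to k ids at random instead of taking the first ones."""
    pool = list(numeric_ids)
    return random.Random(seed).sample(pool, min(k, len(pool)))


def batches(items, size=BATCH_SIZE):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def entities_url(numeric_ids):
    """Build one wbgetentities request for a batch of ids."""
    ids = "|".join(f"Q{n}" for n in numeric_ids)
    params = {"action": "wbgetentities", "props": "claims",
              "format": "json", "ids": ids}
    return API + "?" + urlencode(params)


def award_ids(response):
    """Collect the distinct P166 item values from a parsed response."""
    found = set()
    for entity in response.get("entities", {}).values():
        for claim in entity.get("claims", {}).get("P166", []):
            value = claim.get("mainsnak", {}).get("datavalue", {}).get("value")
            # Claims without a concrete item value are skipped.
            if isinstance(value, dict) and "numeric-id" in value:
                found.add(f"Q{value['numeric-id']}")
    return found


if __name__ == "__main__":
    picked = sample_ids(range(1, 1001), 120, seed=0)
    for batch in batches(picked):
        print(entities_url(batch))
```

Merging the sets returned by `award_ids` across all batches gives the distinct values seen in the sample.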

maxlath answered Oct 17 '22 17:10


You could download a dump of Wikidata in RDF format and search that for all triples where the predicate is P166.

Probably the simplest way is to get the simplified dump (wikidata-simple-statements.nt.gz). In there, the property P166 is represented as a predicate with the URI http://www.wikidata.org/entity/P166c.
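
Scanning such a dump could look roughly like the Python sketch below. It assumes the file is gzip-compressed N-Triples with one `subject predicate object .` triple per line and uses the predicate URI mentioned above; the function names are illustrative only.

```python
import gzip

# Predicate URI for P166 as given in the answer.
PREDICATE = "<http://www.wikidata.org/entity/P166c>"


def distinct_objects(lines, predicate=PREDICATE):
    """Return the distinct objects of all triples using `predicate`."""
    values = set()
    for line in lines:
        # Split into subject, predicate and the rest (object plus final dot).
        parts = line.split(None, 2)
        if len(parts) != 3 or parts[1] != predicate:
            continue
        obj = parts[2].rstrip()
        if obj.endswith("."):
            obj = obj[:-1].rstrip()  # drop the statement terminator
        values.add(obj)
    return values


def distinct_objects_in_dump(path, predicate=PREDICATE):
    """Stream a .nt.gz file line by line without loading it into memory."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return distinct_objects(handle, predicate)
```

Calling `distinct_objects_in_dump("wikidata-simple-statements.nt.gz")` would then return the set of object URIs, each still wrapped in angle brackets.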

svick answered Oct 17 '22 16:10