
How to get all links and their Wikidata IDs for a Wikipedia page?

(When) will the following be possible?

  • get the list of all links on a Wikipedia page with their respective Wikidata IDs in a single query/API call.

  • receive additional information about the respective Wikidata items, such as a property value, with the same query.

G. L. Merebet asked May 07 '16 21:05





1 Answer

To get all links on a Wikipedia page you have to use the Wikipedia API, and to get the properties of Wikidata items you need the Wikidata API, so a single query cannot cover both; that takes two requests, one to each API. The first part of your question, however, is already possible. As for the second part, you didn't say what information you need from Wikidata.
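If the missing piece is a property value, the second request could look roughly like the sketch below. It assumes the Wikidata API's wbgetentities action with format=json; the helper names, the User-Agent string, and the sample data are made up for illustration, and the sample is hand-written in the shape of a response, not real API output.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def build_entities_query(qids):
    """Parameters for one wbgetentities request (IDs are pipe-separated)."""
    return {
        "action": "wbgetentities",
        "format": "json",
        "ids": "|".join(qids),
        "props": "claims",  # only statements; labels etc. are left out
    }


def fetch_json(url, params):
    """Perform the GET request and decode the JSON body."""
    # The User-Agent value is a placeholder; use one that identifies your tool.
    request = Request(url + "?" + urlencode(params),
                      headers={"User-Agent": "links-to-wikidata-sketch/0.1"})
    with urlopen(request) as response:
        return json.load(response)


def claim_values(entities_response, qid, prop):
    """Return the raw values of property `prop` on item `qid`."""
    entity = entities_response.get("entities", {}).get(qid, {})
    values = []
    for claim in entity.get("claims", {}).get(prop, []):
        # "no value" / "unknown value" statements carry no datavalue
        datavalue = claim.get("mainsnak", {}).get("datavalue")
        if datavalue is not None:
            values.append(datavalue["value"])
    return values


# Hand-written sample with a hypothetical item ID (not real API output)
sample_entities = {
    "entities": {
        "Q111": {
            "claims": {
                "P18": [
                    {"mainsnak": {"snaktype": "value", "property": "P18",
                                  "datavalue": {"value": "Example.jpg",
                                                "type": "string"}}}
                ]
            }
        }
    }
}
```

A real call would be fetch_json(WIKIDATA_API, build_entities_query(qids)), with the IDs taken from the Wikipedia response. The API caps how many IDs one request may carry, so a long list has to be split into batches.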

You can get Wikidata IDs and a lot of other information for every page linked from a Wikipedia page: coordinates, references, internal and external links, images, text content, contributors, history, page rights, categories, templates, etc. For this the Wikipedia API alone is enough, because our entry point is the Wikipedia page; we only need to add the API's generator feature.

For example, this is how to get the Wikidata ID, a short intro text, and the main image for the first 20 links on the Dolphin Wikipedia page (generator=links returns links to other pages of the same wiki):

https://en.wikipedia.org/w/api.php?action=query&generator=links&format=xml&redirects=1&titles=Dolphin&prop=pageprops|extracts|pageimages&gpllimit=20&ppprop=wikibase_item&exintro=1&exlimit=20&piprop=name&pilimit=20

Main query parameters:

  • action=query&format=xml&redirects=1&titles=Dolphin
  • generator=links - to get the page's links (gpllimit=20 limits each request to 20 links)
  • prop=pageprops|extracts|pageimages - what to get from the links

Properties:

  • pageprops - to get the Wikidata ID (works with ppprop=wikibase_item)
  • extracts - to get the intro text of each linked page, i.e. the content before its first section (works with exintro=1 and exlimit=20)
  • pageimages - to get the file name of the main image (works with piprop=name and pilimit=20)
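As a rough sketch, the same request can be assembled and its result read in Python. Two assumptions differ from the URL above: format=json is used instead of format=xml because it is easier to parse, and the default JSON layout is expected, where query.pages is an object keyed by page ID. The function names and the sample response are hypothetical; the sample is hand-written, not real API output.

```python
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"


def build_links_query(title, limit=20):
    """Parameters for the generator=links query described above."""
    return {
        "action": "query",
        "format": "json",  # assumption: json instead of the xml used in the URL
        "redirects": 1,
        "titles": title,
        "generator": "links",
        "gpllimit": limit,
        "prop": "pageprops|extracts|pageimages",
        "ppprop": "wikibase_item",
        "exintro": 1,
        "exlimit": limit,
        "piprop": "name",
        "pilimit": limit,
    }


def extract_wikidata_ids(response):
    """Map each linked page title to its Wikidata ID (None if it has none)."""
    pages = response.get("query", {}).get("pages", {})
    return {
        page["title"]: page.get("pageprops", {}).get("wikibase_item")
        for page in pages.values()
    }


# Hand-written sample with hypothetical titles and IDs (not real API output)
sample_response = {
    "query": {
        "pages": {
            "101": {"pageid": 101, "title": "Page A",
                    "pageprops": {"wikibase_item": "Q111"}},
            "-1": {"title": "Page B", "missing": ""},
        }
    }
}

if __name__ == "__main__":
    # Print the request URL and the parsed sample
    print(API_URL + "?" + urlencode(build_links_query("Dolphin")))
    print(extract_wikidata_ids(sample_response))
```

Pages that do not exist yet, or that have no Wikidata item, come back without a pageprops entry, which is why the lookup falls back to None.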

In the same way you can get any of the other information available through the prop parameter.
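The example request returns only the first 20 links, while the question asks for all of them. The sketch below shows one way to follow the API's continuation data until nothing is left, assuming JSON responses in the default layout. The function names are hypothetical, and fetch is any callable that takes a parameter dict and returns the decoded response, so the loop can be tried without network access.

```python
def iterate_batches(params, fetch):
    """Yield every response batch, following the `continue` block."""
    request = dict(params)
    while True:
        data = fetch(request)
        yield data
        if "continue" not in data:
            break
        # Repeat the original request plus the continuation values returned
        request = {**params, **data["continue"]}


def collect_wikidata_ids(params, fetch):
    """Merge all batches into one mapping of page title to Wikidata ID."""
    ids = {}
    for data in iterate_batches(params, fetch):
        for page in data.get("query", {}).get("pages", {}).values():
            qid = page.get("pageprops", {}).get("wikibase_item")
            # A page can show up in several batches with partial data;
            # never overwrite a known ID with None.
            if qid or page["title"] not in ids:
                ids[page["title"]] = qid
    return ids
```

With a fetch function that performs the HTTP request, collect_wikidata_ids({"action": "query", "format": "json", "titles": "Dolphin", "generator": "links", "gpllimit": 20, "prop": "pageprops", "ppprop": "wikibase_item"}, fetch) would walk through every link of the page in batches of 20.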

Termininja answered Oct 19 '22 18:10
