
Get all sections separately with the Wikimedia API

Tags:

api

mediawiki

I am trying to get all sections of a Wikipedia article separately through the API.

I already know:

  • How to retrieve the complete text:

    http://en.wikipedia.org/w/api.php?action=query&prop=revisions&rvlimit=1&titles=house&rvprop=content

  • How to retrieve a specific section of the text:

    http://en.wikipedia.org/w/api.php?action=query&prop=revisions&rvlimit=1&titles=house&rvprop=content&rvsection=0

How can I retrieve all sections separately with one request (as a JSON array, for example)?

mcfly soft asked Nov 28 '14 17:11



2 Answers

What you are asking for is called parsing, because splitting a page into sections requires interpreting the wikitext source. The solution is described at https://www.mediawiki.org/wiki/API:Parsing_wikitext

1) Get the list of sections: https://www.mediawiki.org/w/api.php?action=parse&page=API:Parsing_wikitext&prop=sections

2) Request the parsed text of a given section (returned as HTML): https://www.mediawiki.org/w/api.php?action=parse&page=API:Parsing_wikitext&section=1&prop=text
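The two steps above can be combined in a short script. What follows is only a sketch, assuming Python 3 with the standard library, the English Wikipedia endpoint, and the default JSON output format; the helper names (`parse_params`, `api_get`, `fetch_sections`) and the User-Agent string are made up for illustration and are not part of the API. Note that it still needs one request per section, plus one for the section list.

```python
import json
import urllib.parse
import urllib.request

# Assumption: English Wikipedia; other wikis expose the same api.php path.
API_URL = "https://en.wikipedia.org/w/api.php"


def parse_params(page, **extra):
    """Build the query parameters for an action=parse request (hypothetical helper)."""
    params = {"action": "parse", "page": page, "format": "json"}
    params.update(extra)
    return params


def api_get(params):
    """Send a GET request to the API and decode the JSON response."""
    url = API_URL + "?" + urllib.parse.urlencode(params)
    # The User-Agent value is a placeholder; use one that identifies your tool.
    request = urllib.request.Request(url, headers={"User-Agent": "section-demo/0.1 (example)"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def fetch_sections(page):
    """Return a list of (heading, html) pairs, one per numbered section of the page."""
    listing = api_get(parse_params(page, prop="sections"))
    result = []
    for section in listing["parse"]["sections"]:
        index = section["index"]
        # Skip sections whose index is not a plain number (for example,
        # sections that come from a transcluded template).
        if not index.isdigit():
            continue
        data = api_get(parse_params(page, prop="text", section=index))
        # With the default JSON format, the HTML is stored under the "*" key.
        result.append((section["line"], data["parse"]["text"]["*"]))
    return result
```

Calling `fetch_sections("House")` would return the headed sections only; the lead text before the first heading is section 0 and is not part of the section list, so it has to be requested separately with `section=0`.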

Nemo answered Oct 20 '22 19:10



I realize this question was asked four years ago, so possibly the following was not available then:

You can use the REST API described here: https://www.mediawiki.org/wiki/REST_API

The REST endpoints are described/documented here: https://en.wikipedia.org/api/rest_v1/#/

The mobile-sections endpoint (intended for consuming info for a mobile device) gives you a nice breakdown with headings, which sounds like what you are asking for.

Alternatively, the metadata endpoint returns a toc (table of contents) section which contains the same breakdown of headings.

Here is an example URL, fetching the mobile sections for the "Egyptian pyramids" page: https://en.wikipedia.org/api/rest_v1/page/mobile-sections/Egyptian_pyramids

The advantage is that the response is in JSON format (which is what you were asking for).
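As a rough illustration of consuming that JSON, here is a minimal sketch that pulls the headings out of an already decoded response. It assumes the response has a top-level "remaining" object with a "sections" list, where each entry carries "id", "toclevel" and "line" fields; that layout is an assumption to check against the endpoint documentation, and `section_headings` and the sample data are hypothetical.

```python
import json


def section_headings(payload):
    """Collect (id, toclevel, heading) triples from a decoded mobile-sections response.

    Assumes the sections after the lead are listed under payload["remaining"]["sections"].
    """
    sections = payload.get("remaining", {}).get("sections", [])
    return [(s.get("id"), s.get("toclevel"), s.get("line")) for s in sections]


# Made-up sample in the assumed layout, standing in for a real response body.
sample_body = """
{
  "lead": {"sections": [{"id": 0}]},
  "remaining": {"sections": [
    {"id": 1, "toclevel": 1, "line": "History", "text": "<p>...</p>"}
  ]}
}
"""

for heading in section_headings(json.loads(sample_body)):
    print(heading)
```

Keeping the extraction separate from the HTTP call makes it easy to adapt if the field names turn out to differ.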

mydoghasworms answered Oct 20 '22 19:10
