
How can I get Wikipedia content using Wikipedia's API?

I want to get the first paragraph of a Wikipedia article.

What is the API query to do so?

asked Aug 25 '11 04:08 by bbnn



1 Answer

See this section in the MediaWiki documentation.

These are the key parameters.

prop=revisions&rvprop=content&rvsection=0 

rvsection=0 restricts the result to the lead section, i.e. the content before the first heading.

See this example.

http://en.wikipedia.org/w/api.php?action=query&prop=revisions&rvprop=content&rvsection=0&titles=pizza
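As a rough sketch, the same request could be built and sent from Python using only the standard library. The function names, the `format=json` parameter, the `https` scheme, and the User-Agent string are assumptions added for illustration; they are not part of the query shown above.

```python
import json
import urllib.parse
import urllib.request

# Assumption: same host and path as the example URL, but over https.
API_URL = "https://en.wikipedia.org/w/api.php"


def build_lead_query(title):
    """Build the query URL for the lead-section wikitext of `title`."""
    params = {
        "action": "query",
        "prop": "revisions",
        "rvprop": "content",
        "rvsection": "0",   # section 0 is the lead section
        "titles": title,
        "format": "json",   # assumption: ask for JSON output
    }
    return API_URL + "?" + urllib.parse.urlencode(params)


def fetch_lead_section(title):
    """Send the request and return the decoded JSON response.

    The wikitext is nested inside the response under
    query -> pages -> <page id> -> revisions.
    """
    request = urllib.request.Request(
        build_lead_query(title),
        # Hypothetical User-Agent; replace with one identifying your tool.
        headers={"User-Agent": "lead-section-example/0.1"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `fetch_lead_section("pizza")` performs the network request; `build_lead_query("pizza")` only builds the URL and can be inspected offline.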

To get the HTML instead, you can similarly use action=parse:

http://en.wikipedia.org/w/api.php?action=parse&section=0&prop=text&page=pizza

Note that you'll have to strip out any templates or infoboxes.
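One possible way to do that stripping on the HTML output is sketched below. It assumes that infoboxes are rendered as `<table>` elements and that the first non-empty `<p>` outside any table is the paragraph you want; both are assumptions about the markup, not guarantees from the API. The class and function names are hypothetical.

```python
from html.parser import HTMLParser


class FirstParagraphParser(HTMLParser):
    """Collect the text of the first non-empty <p> that is outside any <table>."""

    def __init__(self):
        super().__init__()
        self.table_depth = 0   # > 0 while inside a table (e.g. an infobox)
        self.in_paragraph = False
        self.done = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.table_depth += 1
        elif tag == "p" and self.table_depth == 0 and not self.done:
            self.in_paragraph = True
            self.parts = []

    def handle_endtag(self, tag):
        if tag == "table" and self.table_depth > 0:
            self.table_depth -= 1
        elif tag == "p" and self.in_paragraph:
            self.in_paragraph = False
            # Skip empty placeholder paragraphs and keep looking.
            if "".join(self.parts).strip():
                self.done = True

    def handle_data(self, data):
        if self.in_paragraph and not self.done:
            self.parts.append(data)


def first_paragraph(html):
    """Return the plain text of the first real paragraph, or '' if none is found."""
    parser = FirstParagraphParser()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip() if parser.done else ""
```

This keeps any text inside inline tags, so footnote markers such as `[1]` would still appear in the result and may need a further cleanup pass.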

answered Sep 21 '22 08:09 by Gabe