I am trying to use Rails to extract data from Wikipedia, based on a search term.
For example,
1) If I have the string "American Idol", I want to pass it to Wikipedia and get a list of related articles. My goal is to take the first 3 hyperlinks and display them on the website.
2) One step further would be extracting small pieces of data from Wikipedia, say the infobox or the first few words of the Wikipedia article.
Any tips?
Thanks!
What is the Wikipedia API? The Wikipedia API (official documentation) is backed by the MediaWiki API and provides access to Wikipedia and other MediaWiki data without going through the user interface.
In the desktop view of Wikipedia, in the default skin and most others, the left-hand panel has a "Wikidata item" link under "Tools". Copy the URL of that link, paste it into a text editor, and read (or copy) the ID from it.
Wikipedia and other Wikimedia projects are free, collaborative repositories of knowledge, written and maintained by volunteers from around the world. The Wikimedia API gives you open access to add this free knowledge to your projects and apps.
You don't need to resort to screen-scraping: MediaWiki has a very comprehensive API for precisely this kind of thing. See https://github.com/jpatokal/mediawiki-gateway for a handy Ruby wrapper around it.
Alternatively, if you're only interested in data like infoboxes, see DBpedia for the database version of Wikipedia.
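If you would rather not pull in a gem, here is a minimal sketch of both parts of the question using only Ruby's standard library. It assumes the MediaWiki API's opensearch module (for the list of related articles) and the prop=extracts query (for the plain-text intro of an article) on the English Wikipedia endpoint. The helper names (wikipedia_api_uri, parse_opensearch, parse_extract, search_wikipedia, article_intro) and the User-Agent string are made up for illustration and are not part of any library, so treat this as a starting point rather than a finished implementation.

```ruby
require "net/http"
require "uri"
require "json"

# English Wikipedia endpoint; change the host for other languages.
WIKIPEDIA_API = "https://en.wikipedia.org/w/api.php"

# Build an API URI from a hash of query parameters, always asking for JSON.
def wikipedia_api_uri(params)
  uri = URI(WIKIPEDIA_API)
  uri.query = URI.encode_www_form(params.merge(format: "json"))
  uri
end

# Perform the GET request and return the raw response body.
# The User-Agent value is a placeholder; replace it with your app's details.
def fetch(uri)
  request = Net::HTTP::Get.new(uri)
  request["User-Agent"] = "ExampleRailsApp/0.1 (contact@example.com)"
  response = Net::HTTP.start(uri.hostname, uri.port, use_ssl: true) do |http|
    http.request(request)
  end
  raise "Wikipedia API error: #{response.code}" unless response.is_a?(Net::HTTPSuccess)
  response.body
end

# An opensearch response is a 4-element array:
# [query, titles, descriptions, urls]. Pair each title with its URL.
def parse_opensearch(body)
  _query, titles, _descriptions, urls = JSON.parse(body)
  titles.zip(urls).map { |title, url| { title: title, url: url } }
end

# A prop=extracts response nests pages under query.pages, keyed by page id.
# Return the extract of the first page, or nil if there is none.
def parse_extract(body)
  pages = JSON.parse(body).dig("query", "pages") || {}
  page = pages.values.first
  page && page["extract"]
end

# Part 1: the first `limit` articles matching a search term.
def search_wikipedia(term, limit: 3)
  uri = wikipedia_api_uri(action: "opensearch", search: term, limit: limit)
  parse_opensearch(fetch(uri))
end

# Part 2: the plain-text introduction of a single article.
def article_intro(title)
  uri = wikipedia_api_uri(action: "query", prop: "extracts", exintro: 1,
                          explaintext: 1, redirects: 1, titles: title)
  parse_extract(fetch(uri))
end

# Example usage (makes live network requests):
#   search_wikipedia("American Idol").each { |r| puts "#{r[:title]}: #{r[:url]}" }
#   puts article_intro("American Idol")
```

In a Rails app you would probably call these from a service object or a background job and cache the results, rather than hitting Wikipedia on every page view. Infoboxes are not covered here, since they are not exposed as structured data by these two calls; that is where DBpedia comes in.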