I'm trying to figure out how to check if a Wikipedia article exists. For example,
https://en.wikipedia.org/wiki/Food
exists, however
https://en.wikipedia.org/wiki/Fod
does not, and the page simply says, "Wikipedia does not have an article with this exact name."
Thanks!
Wikipedia is a Python library that makes it easy to access and parse data from Wikipedia. Search Wikipedia, get article summaries, get data like links and images from a page, and more. Wikipedia wraps the MediaWiki API so you can focus on using Wikipedia data, not getting it.
>>> import wikipedia
>>> print wikipedia.
To get the complete plain text content of a Wikipedia page (excluding images, tables, etc.), we can use the content attribute of the page object.
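Since the library is described as a wrapper around the MediaWiki API, the existence check can also be made against that API directly. The following is only a sketch under some assumptions: it uses the `action=query` endpoint with `formatversion=2`, where a nonexistent title is reported with a `missing` flag; the helper names and the User-Agent string are made up for illustration.

```python
import json
import urllib.request
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"


def build_query_url(title):
    """Build a MediaWiki 'query' URL for a single title."""
    params = {
        "action": "query",
        "titles": title,
        "format": "json",
        "formatversion": "2",
    }
    return API_URL + "?" + urlencode(params)


def page_exists(api_response):
    """Interpret a decoded API response: True if the page is not flagged 'missing'."""
    pages = api_response.get("query", {}).get("pages", [])
    if not pages:
        return False
    return not pages[0].get("missing", False)


def fetch_exists(title):
    """Query the live API (needs network access)."""
    request = urllib.request.Request(
        build_query_url(title),
        # Hypothetical User-Agent; replace with one describing your own tool.
        headers={"User-Agent": "article-exists-example/0.1"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return page_exists(json.load(response))
```

Calling `fetch_exists("Food")` and `fetch_exists("Fod")` should then distinguish the two titles from the question, assuming the API responds as described.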
>>> import urllib
>>> print urllib.urlopen("https://en.wikipedia.org/wiki/Food").getcode()
200
>>> print urllib.urlopen("https://en.wikipedia.org/wiki/Fod").getcode()
404
Is that OK?
Or:
>>> a = urllib.urlopen("https://en.wikipedia.org/wiki/Fod").getcode()
>>> if a == 404:
...     print "Wikipedia does not have an article with this exact name."
...
Wikipedia does not have an article with this exact name.
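Note that `urllib.urlopen` and the `print` statement used above exist only in Python 2. In Python 3 the function lives at `urllib.request.urlopen`, and it raises `urllib.error.HTTPError` for a 404 instead of returning a response, so the check has to catch that exception. A rough Python 3 equivalent might look like this; the function name, the `opener` parameter (there only so the logic can be exercised without network access), and the User-Agent string are assumptions for illustration.

```python
import urllib.error
import urllib.request


def article_exists(url, opener=urllib.request.urlopen):
    """Return True for HTTP 200, False for HTTP 404; re-raise other HTTP errors."""
    request = urllib.request.Request(
        url,
        # Hypothetical User-Agent; replace with one describing your own tool.
        headers={"User-Agent": "article-exists-example/0.1"},
    )
    try:
        with opener(request, timeout=10) as response:
            return response.getcode() == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise  # anything else (403, 429, 500, ...) is a different problem
```

With network access, `article_exists("https://en.wikipedia.org/wiki/Fod")` would be expected to return `False`, matching the 404 seen in the session above.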
Basically, most websites and web services report a status code in the HTTP response to each of your requests.
In your case, you can simply check whether the status code is 404: that is what Wikipedia returns when the article does not exist, even though your browser renders a page that looks like a normal result.
import requests

result = requests.get('https://en.wikipedia.org/wiki/Food')
if result.status_code == 200:  # the article exists
    pass  # handle the existing article here
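The status-code check can be wrapped in a small helper that handles both the 200 and the 404 case. This is a sketch, not a definitive implementation: it assumes the third-party `requests` package is installed, and the function name, the `get` parameter (which only exists so a stand-in can be passed for testing), and the User-Agent string are invented for this example.

```python
def wikipedia_article_exists(url, get=None):
    """Return True on HTTP 200, False on HTTP 404, and raise on anything else."""
    if get is None:
        import requests  # third-party package, imported only when needed
        get = requests.get
    response = get(
        url,
        # Hypothetical User-Agent; replace with one describing your own tool.
        headers={"User-Agent": "article-exists-example/0.1"},
        timeout=10,
    )
    if response.status_code == 200:
        return True
    if response.status_code == 404:
        return False
    # Rate limiting, server errors, etc. say nothing about whether the article exists.
    raise RuntimeError("unexpected HTTP status: %d" % response.status_code)
```

Treating every non-200 status as "does not exist" would misreport temporary failures, which is why the sketch raises for anything other than 200 and 404.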