
The wikipedia api seems to almost always get the word in question wrong

I'm using the wikipedia Python library (https://pypi.org/project/wikipedia/), and in most cases it seems to autocorrect the terms I pass in, so the results are often wrong.

For instance, "frog" gets changed to "food" and "crown" gets changed to "cross":

input: wikipedia.page("frog")
output: <WikipediaPage 'Food'>

input: wikipedia.summary("Frog")
output: 'Food is any substance consumed to provide nutritional support for an organism...'

input: wikipedia.page("crown")
output: <WikipediaPage 'Cross'>

wikipedia.search does return an appropriate list, but I don't know how to use it to get the correct page from .summary, .page, etc.:

input: print(wikipedia.search("frog"))
output: ['Frog', 'FROG', 'The Princess and the Frog', 'Boiling frog', 'Frog legs', 'Frogger', 'The Scorpion and the Frog', 'Pepe the Frog', 'The Frog Prince', 'Common frog']
Will asked Mar 14 '21 23:03




1 Answer

This happens because auto_suggest defaults to True on summary() (page() has the same default).

According to the docs, you can change this to False and it will correctly return the summary for frog.

wikipedia.summary("Frog", auto_suggest=False)
#'A frog is any member of a diverse and largely carnivorous group of short-bodied, tailless amphibians composing the order Anura (literally without tail in Ancient Greek)

For whatever reason, the library's suggest() feature returns odd results, as the calls below show. It is likely best to set auto_suggest to False.

wikipedia.suggest("Frog")
#'food'
wikipedia.suggest("Steak")
#'steam'
wikipedia.suggest("Dog")
#'do'
wikipedia.suggest("cat")
#'cats'
wikipedia.suggest("david attenborough")
#None 
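
To use the search() results the question mentions, one option is to pick a title from that list and then request it with auto_suggest=False. The following is only a minimal sketch: pick_title and summary_via_search are hypothetical helper names, not part of the library, and the wikipedia module is passed in as the `wiki` argument (an assumption made so the helper can be tried without network access).

```python
def pick_title(query, candidates):
    """Return the candidate equal to the query (ignoring case),
    otherwise the first candidate, otherwise None."""
    wanted = query.strip().casefold()
    for title in candidates:
        if title.casefold() == wanted:
            return title
    return candidates[0] if candidates else None


def summary_via_search(query, wiki):
    """Fetch a summary by way of search() rather than suggest().

    `wiki` is assumed to be the imported `wikipedia` module, or any
    object exposing compatible search() and summary() callables.
    """
    title = pick_title(query, wiki.search(query))
    if title is None:
        return None
    # auto_suggest=False stops the library from rewriting the title.
    return wiki.summary(title, auto_suggest=False)
```

With the real library this would be called as summary_via_search("frog", wikipedia) after import wikipedia. Ambiguous titles can still raise wikipedia.exceptions.DisambiguationError, so wrapping the call in a try/except may be worthwhile.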
PacketLoss answered Sep 25 '22 15:09
