I'm using the wikipedia Python library (https://pypi.org/project/wikipedia/), and in most cases it seems to autocorrect the terms I pass in, so the results are often wrong.
For instance, "frog" gets changed to "food" and "crown" gets changed to "cross":
input: wikipedia.page("frog")
output: <WikipediaPage 'Food'>
input: wikipedia.summary("Frog")
output: 'Food is any substance consumed to provide nutritional support for an organism...'
input: wikipedia.page("crown")
output: <WikipediaPage 'Cross'>
When using wikipedia.search, it seems to provide an appropriate list, but I don't know how to utilize this to get the correct page when using .summary, etc:
input: print(wikipedia.search("frog"))
output: ['Frog', 'FROG', 'The Princess and the Frog', 'Boiling frog', 'Frog legs', 'Frogger', 'The Scorpion and the Frog', 'Pepe the Frog', 'The Frog Prince', 'Common frog']
To get the complete plain text content of a Wikipedia page (excluding images, tables, etc.), we can use the content attribute of the page object.
This is due to the default for auto_suggest on summary() being True. According to the docs, you can change this to False and it will correctly return the summary for frog.
wikipedia.summary("Frog", auto_suggest=False)
#'A frog is any member of a diverse and largely carnivorous group of short-bodied, tailless amphibians composing the order Anura (literally without tail in Ancient Greek)
It seems that, for whatever odd reason, the API's suggest() feature is... weird. It would likely be best to keep auto_suggest set to False.
wikipedia.suggest("Frog")
#'food'
wikipedia.suggest("Steak")
#'steam'
wikipedia.suggest("Dog")
#'do'
wikipedia.suggest("cat")
#'cats'
wikipedia.suggest("david attenborough")
#None
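To tie this back to the search results in the question, here is a minimal sketch that takes the top hit from wikipedia.search and passes it to summary() with auto_suggest=False. The helper name exact_summary and its wiki parameter are hypothetical, added only so the function can be exercised with a stand-in object. Trusting the first search hit is also an assumption and may not suit every term.

```python
def exact_summary(term, wiki=None):
    """Return the summary for the best-matching title, bypassing auto-suggest.

    `wiki` is a hypothetical hook: any object exposing search() and summary()
    like the wikipedia module. It defaults to the real library.
    """
    if wiki is None:
        import wikipedia as wiki  # third-party: pip install wikipedia

    # Use the top search hit as the title; fall back to the raw term
    # if the search returns nothing.
    results = wiki.search(term)
    title = results[0] if results else term

    # auto_suggest=False stops the library from rewriting the title
    # (e.g. "frog" -> "food") before looking the page up.
    return wiki.summary(title, auto_suggest=False)
```

With this, exact_summary("frog") should look up 'Frog' rather than 'Food'. wikipedia.page() accepts the same auto_suggest argument, so the same approach should work when you need the page object instead of the summary. A top hit that is a disambiguation title can still raise wikipedia.DisambiguationError, which this sketch does not handle.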