The wikipedia api seems to almost always get the word in question wrong

I'm using the wikipedia Python library (https://pypi.org/project/wikipedia/). In most cases it seems to autocorrect my search terms, so the page it returns is often the wrong one.

For instance, "frog" gets changed to "food" and "crown" gets changed to "cross":

input: wikipedia.page("frog")
output: <WikipediaPage 'Food'>

input: wikipedia.summary("Frog")
output: 'Food is any substance consumed to provide nutritional support for an organism...'

input: wikipedia.page("crown")
output: <WikipediaPage 'Cross'>

wikipedia.search does return an appropriate list of titles, but I don't know how to use it to get the correct page from .summary, .page, etc.:

input: print(wikipedia.search("frog"))
output: ['Frog', 'FROG', 'The Princess and the Frog', 'Boiling frog', 'Frog legs', 'Frogger', 'The Scorpion and the Frog', 'Pepe the Frog', 'The Frog Prince', 'Common frog']

asked Mar 14 '21 23:03

Will

1 Answer

This happens because auto_suggest defaults to True on summary() (and on page()), so the title is first passed through suggest() and replaced with whatever it returns.

According to the docs, you can change this to False and it will correctly return the summary for frog.

wikipedia.summary("Frog", auto_suggest=False)
#'A frog is any member of a diverse and largely carnivorous group of short-bodied, tailless amphibians composing the order Anura (literally without tail in Ancient Greek)...'
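The question also calls wikipedia.page(), which accepts the same auto_suggest keyword. A minimal sketch, assuming the wikipedia package from PyPI is installed and the network is reachable; exact_page is a hypothetical helper name, and the wiki parameter exists only so the function can be tried with a stand-in object:

```python
def exact_page(title, wiki=None):
    """Fetch the page for `title` without running it through suggest().

    `wiki` defaults to the third-party `wikipedia` module (an assumption:
    it must be installed, e.g. via `pip install wikipedia`).
    """
    if wiki is None:
        import wikipedia as wiki  # imported lazily so the helper can be stubbed

    # auto_suggest=False makes the library look up the title as given
    # instead of replacing it with the result of suggest().
    return wiki.page(title, auto_suggest=False)


# Example call (needs network access):
# page = exact_page("crown")
# print(page.title)
```

Passing auto_suggest=False on every call is easy to forget, so a small wrapper like this keeps the setting in one place.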

For whatever reason, the library's suggest() feature behaves oddly and often returns an unrelated term, so it is probably best to keep auto_suggest set to False:

wikipedia.suggest("Frog")
#'food'
wikipedia.suggest("Steak")
#'steam'
wikipedia.suggest("Dog")
#'do'
wikipedia.suggest("cat")
#'cats'
wikipedia.suggest("david attenborough")
#None
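To use the search() results from the question, one option is to take the titles it returns and fetch each with auto_suggest=False until one works. This is only a sketch under some assumptions: first_match_summary is a hypothetical helper name, and it relies on the package exposing DisambiguationError and PageError at the top level, which I believe it does but have not checked against every version:

```python
def first_match_summary(query, wiki=None):
    """Return (title, summary) for the first usable search hit, or None.

    `wiki` defaults to the third-party `wikipedia` module; it is a
    parameter only so the function can be exercised with a stand-in.
    """
    if wiki is None:
        import wikipedia as wiki  # assumption: installed via pip

    for title in wiki.search(query):
        try:
            # Look the title up exactly as search() returned it.
            return title, wiki.summary(title, auto_suggest=False)
        except (wiki.DisambiguationError, wiki.PageError):
            # Ambiguous or missing title: fall through to the next hit.
            continue
    return None


# Example call (needs network access):
# result = first_match_summary("frog")
# if result is not None:
#     title, text = result
#     print(title, text[:80])
```

Taking the first hit is a simplification; if the top result is not the one you want, you can print wikipedia.search(query) and pick a title yourself before calling summary() with auto_suggest=False.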

answered Sep 25 '22 15:09

PacketLoss


Tags: python, wikipedia, wikipedia-api
