I am grabbing a Wikia page using Python requests. There's a problem, though: the requests request isn't giving me the same HTML as my browser is with the very same page.
For comparison, here's the page Firefox gets me, and here's the page requests fetches (download them to view - sorry, no easy way to just visually host a bit of HTML from another site).
You'll note a few differences (super unfriendly diff). There are some small things, like attributes being ordered differently and such, but there are also a few very, very large things. Most important is the lack of the last six <img>s, and the entirety of the navigation and footer sections. Even in the raw HTML it looks like the page cut off abruptly.
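To put numbers on the difference rather than eyeballing the diff, here is a rough sketch that counts start tags in each copy with the standard library's html.parser and reports which tags the browser copy has more of. The helper names (TagCounter, count_tags, missing_tags) are made up for illustration, and the two short strings only stand in for the saved pages.

```python
from collections import Counter
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Counts every start tag seen while parsing an HTML document."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <img/>.
        self.counts[tag] += 1


def count_tags(html):
    """Return a Counter mapping tag name to number of occurrences."""
    parser = TagCounter()
    parser.feed(html)
    parser.close()
    return parser.counts


def missing_tags(browser_html, fetched_html):
    """Return the tags that occur more often in the browser copy."""
    # Counter subtraction keeps only positive differences.
    return count_tags(browser_html) - count_tags(fetched_html)


# Stand-in strings; in practice, read the two saved HTML files instead.
browser_copy = "<html><body><img src='a'><img src='b'><footer></footer></body></html>"
fetched_copy = "<html><body><img src='a'>"
print(missing_tags(browser_copy, fetched_copy))
```

If the fetched copy really is cut short, the tags near the end of the page (the trailing images, navigation, footer) should show up in the result.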
Why is this happening, and is there a way to fix it? I've thought of a bunch of things already, none of which have been fruitful:
I copied the request headers, User-Agent and all, 1:1 into the requests request, but nothing changed.
It'd be amazing if you know a way this could happen and a way to fix it. Thank you!
I had a similar issue:
To resolve the issue, I ended up swapping out the requests library for urllib.request.
Basically, I replaced:
    import requests
    session = requests.Session()
    r = session.get(URL)
with:
    import urllib.request
    r = urllib.request.urlopen(URL)
and then it worked.
Maybe one of those libraries is doing something strange behind the scenes? Not sure if that's an option for you or not.
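In case it helps, here is a slightly fuller sketch of that swap. It is not the exact code I ran: the header values are placeholders to replace with whatever your browser sends, and looks_truncated is a crude heuristic I am assuming is good enough to spot a page that stops before its closing </html> tag. The function names are hypothetical.

```python
import urllib.request

# Placeholder header values (assumption); copy the real ones from your browser.
BROWSER_HEADERS = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0",
    "Accept": "text/html,application/xhtml+xml",
}


def build_request(url, headers=BROWSER_HEADERS):
    """Build a urllib Request that carries browser-like headers."""
    return urllib.request.Request(url, headers=headers)


def looks_truncated(html):
    """Heuristic: a complete page should contain a closing </html> tag."""
    return "</html>" not in html.lower()


def fetch_html(url, timeout=10):
    """Fetch a page with urllib.request and decode it to text."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Calling fetch_html(URL) and then looks_truncated on the result gives a quick yes/no on whether the page still ends early; running the same check on r.text from requests shows whether the two libraries really differ for your URL.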