Python requests-html is not downloading Chromium

from bs4 import BeautifulSoup
from requests_html import HTMLSession

url = "https://dmarket.com/ingame-items/item-list/csgo-skins?title=recoil%20case"
session = HTMLSession()
response = session.get(url)
# render() runs the page's JavaScript in Chromium, which pyppeteer downloads on first use
response.html.render()
soup = BeautifulSoup(response.html.html, features="html.parser")
print(soup)

After running it, the output was:

[INFO] Starting Chromium download.

It then crashed with this error in VS Code:

Chromium downloadable not found at https://storage.googleapis.com/chromium-browser-snapshots/Win_x64/1181205/chrome-win.zip: Received <?xml version='1.0' encoding='UTF-8'?><Error><Code>NoSuchKey</Code><Message>The specified key does not exist.</Message><Details>No such object: chromium-browser-snapshots/Win_x64/1181205/chrome-win.zip</Details></Error>

I have tried installing different versions of requests_html.

Mihali George asked Apr 13 '26 06:04


2 Answers

UPDATE

Thanks to @Abdul Aziz Barkat's comment, it turns out that you can specify the Chromium revision through an environment variable, and pyppeteer will use it:

PYPPETEER_CHROMIUM_REVISION = '1263111'
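
In a script, one way to apply this is to set the variable before requests_html (and therefore pyppeteer) is imported. The following is only a minimal sketch, assuming pyppeteer reads the variable when it is first imported; pin_chromium_revision and render_html are hypothetical helper names, not part of either library.

```python
import os

# Revision quoted in this answer; it may itself be removed from storage later.
CHROMIUM_REVISION = "1263111"


def pin_chromium_revision(revision: str = CHROMIUM_REVISION) -> str:
    """Export the revision so pyppeteer can pick it up when it is first imported."""
    os.environ["PYPPETEER_CHROMIUM_REVISION"] = revision
    return os.environ["PYPPETEER_CHROMIUM_REVISION"]


def render_html(url: str) -> str:
    """Fetch url and return its HTML after JavaScript rendering (hypothetical helper)."""
    pin_chromium_revision()
    # Imported here, after the variable is set, because requests_html imports pyppeteer.
    from requests_html import HTMLSession

    session = HTMLSession()
    response = session.get(url)
    response.html.render()  # triggers the Chromium download on first use
    return response.html.html
```

If pyppeteer has already been imported elsewhere in the process, setting the variable this late will probably have no effect, so the call should come as early as possible.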

requests-html uses the pyppeteer library to download Chromium, and it looks like Chromium revision 1181205, which is hardcoded in pyppeteer, has been removed from Google's storage.

Since requests-html installs pyppeteer along with it, a simple workaround is to edit line 20 of pyppeteer's __init__.py file in your environment:

__chromium_revision__ = '1181205' -> __chromium_revision__ = '1263111'
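
If it is unclear where that file lives in your environment, the standard library can report its path without importing the package. This is a small sketch; find_package_init is a hypothetical helper name.

```python
import importlib.util
from typing import Optional


def find_package_init(package: str) -> Optional[str]:
    """Return the path of a package's __init__.py, or None if the package is not found."""
    spec = importlib.util.find_spec(package)
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    # Prints the file to edit, or None if pyppeteer is not installed here.
    print(find_package_init("pyppeteer"))
```

Run it with the same interpreter (virtual environment) that runs your scraper, otherwise it may point at a different installation.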

Note: I used revision 1263111 because it was the latest for Win_x64 at the time of answering, and it works fine.
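
Because revisions can disappear from storage, it may be worth checking that one still exists before pinning it. The sketch below rebuilds the URL pattern seen in the error message from the question; the Win_x64 platform and chrome-win.zip archive name are taken from that message and are assumptions for other platforms or revisions, and snapshot_url and revision_exists are hypothetical helper names.

```python
import urllib.error
import urllib.request

BASE_URL = "https://storage.googleapis.com/chromium-browser-snapshots"


def snapshot_url(revision: str, platform: str = "Win_x64",
                 archive: str = "chrome-win.zip") -> str:
    """Build the download URL in the form shown in the error message."""
    return f"{BASE_URL}/{platform}/{revision}/{archive}"


def revision_exists(revision: str, timeout: float = 10.0) -> bool:
    """Send a HEAD request; an HTTP error status (such as 404) is treated as missing."""
    request = urllib.request.Request(snapshot_url(revision), method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.HTTPError:
        return False
```

Calling revision_exists("1263111") needs network access; connection failures are not caught and will raise urllib.error.URLError.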

Lk4m1 answered Apr 15 '26 19:04


I tried the solution by @Lk4m1. This setup worked for me:

import asyncio
import os

# Set the revision before importing pyppeteer so that it is picked up on import.
os.environ['PYPPETEER_CHROMIUM_REVISION'] = '1263111'

from pyppeteer import launch


async def generate_pdf(url, pdf_path):
    browser = await launch()
    page = await browser.newPage()
    
    await page.goto(url)
    
    await page.pdf({'path': pdf_path, 'format': 'A4'})
    
    await browser.close()

# Run the function
asyncio.get_event_loop().run_until_complete(generate_pdf('https://example.com', 'example.pdf'))

ayoubachak answered Apr 15 '26 19:04


