
Download HTML page and its contents

Tags: python, html

Does Python have any way of downloading an entire HTML page and its contents (images, CSS) to a local folder, given a URL, and then updating the local HTML file so that it loads that content from the local copies?

bocca asked Dec 01 '09 10:12





1 Answer

You can use the urllib module (urllib.request in Python 3) to download individual URLs, but this will just return the raw data. It will not parse the HTML or automatically download the things the page refers to, such as CSS files and images.
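As a rough illustration of the single-URL case, here is a minimal sketch using Python 3's urllib.request. The helper name `fetch` and the 10-second default timeout are my own choices for the example, not anything from the question.

```python
import urllib.request


def fetch(url, timeout=10):
    """Return the raw bytes served at `url`.

    `fetch` is a hypothetical helper name. It downloads exactly one
    resource; nothing the page links to (images, CSS) is retrieved.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


# Example call (placeholder URL, requires network access):
#     html_bytes = fetch("https://example.com/")
```

The result is just bytes; deciding what else to download is left to the caller.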

If you want to download the "whole" page you will need to parse the HTML and find the other things you need to download. You could use something like Beautiful Soup to parse the HTML you retrieve.
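To give an idea of what the parsing step might look like, here is a small sketch that lists the images, stylesheets and scripts a page refers to. To stay dependency-free it uses the standard library's html.parser rather than Beautiful Soup, so treat it as a stand-in for the same idea. The names `AssetCollector` and `find_assets` are made up for the example, and it assumes that only `<img>`, `<script>` and `<link rel="stylesheet">` matter.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class AssetCollector(HTMLParser):
    """Collect absolute URLs of images, scripts and stylesheets.

    Hypothetical helper class; it covers only a few common tags.
    """

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.assets = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        url = None
        if tag in ("img", "script"):
            url = attrs.get("src")
        elif tag == "link" and "stylesheet" in (attrs.get("rel") or "").lower():
            url = attrs.get("href")
        if url:
            # Resolve relative references against the page's own URL.
            self.assets.append(urljoin(self.base_url, url))


def find_assets(html, base_url):
    """Return asset URLs found in `html`, in document order."""
    parser = AssetCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.assets
```

Each URL returned could then be downloaded in turn and saved to the local folder, with the matching `src` or `href` in the saved HTML rewritten to point at the local file. Assets referenced from inside CSS (for example `url(...)` backgrounds) are not found by this sketch.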

This question has some sample code doing exactly that.

Dave Webb answered Oct 09 '22 02:10
