
Save HTML Source Code to File

Tags: python-3.x

How can I copy the source code of a website into a text file in Python 3?

EDIT: To clarify my issue, here's what I have:

import urllib.request

def extractHTML(url):
    f = open('temphtml.txt', 'w')
    page = urllib.request.urlopen(url)
    pagetext = page.read()
    f.write(pagetext)
    f.close()

extractHTML('http://www.google.com')

I get the following error for the f.write() function:

builtins.TypeError: must be str, not bytes
user1306802 asked Apr 01 '12 20:04





1 Answer

import urllib.request

# Fetch the page; read() returns the body as bytes, not str
site = urllib.request.urlopen('http://somesite.com')
data = site.read()

# Open the file in binary mode so the bytes can be written unchanged
with open("file.txt", "wb") as file:
    file.write(data)

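Untested, but it should work. In Python 3, read() returns bytes, and a file opened in text mode ('w') only accepts str, which is what causes the TypeError. Opening the file in binary mode ('wb') avoids the problem.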

EDIT: Updated for Python 3.
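If you would rather keep the file in text mode, the other option is to decode the bytes to str before writing. The following is only a sketch of that approach, not something from the original answer: the function name extract_html, the path and fallback_encoding parameters, and the choice to replace undecodable bytes are all assumptions made for illustration. It also assumes the server reports a charset Python recognizes; otherwise decode() raises LookupError.

```python
import urllib.request


def extract_html(url, path="temphtml.txt", fallback_encoding="utf-8"):
    """Download url, save its body to path as text, and return the text.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    with urllib.request.urlopen(url) as page:
        raw = page.read()  # bytes, not str
        # Use the charset from the Content-Type header if one was sent,
        # otherwise fall back to the assumed default.
        charset = page.headers.get_content_charset() or fallback_encoding

    # Assumption: bytes that cannot be decoded are replaced, not fatal.
    text = raw.decode(charset, errors="replace")

    # Text mode accepts str; the file is always written as UTF-8.
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    return text
```

Calling extract_html('http://www.google.com') would then write the decoded page to temphtml.txt. The binary-mode version above is the safer choice when you want an exact byte-for-byte copy, since it does not depend on guessing the encoding.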

Jack answered Nov 15 '22 07:11
