Downloading an image using Python Mechanize

Question

I'm trying to write a Python script to download an image and set it as my wallpaper. Unfortunately, the Mechanize documentation is quite poor. My script follows the link correctly, but I'm having a hard time actually saving the image to my computer. From what I've read, the .retrieve() method should do the job, but how do I specify the path the file should be downloaded to? Here is what I have:

def followLink(browser, fixedLink):
    browser.open(fixedLink)

    if browser.find_link(url_regex = r'1600x1200'):
        browser.follow_link(url_regex = r'1600x1200')
    elif browser.find_link(url_regex = r'1400x1050'):
        browser.follow_link(url_regex = r'1400x1050')
    elif browser.find_link(url_regex = r'1280x960'):
        browser.follow_link(url_regex = r'1280x960')

    return

zhangyangyu · Accepted Answer

import mechanize, os
from BeautifulSoup import BeautifulSoup

browser = mechanize.Browser()
html = browser.open(url)
soup = BeautifulSoup(html)
image_tags = soup.findAll('img')
for image in image_tags:
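    # Drop the URL scheme as a prefix; lstrip('http://') strips characters, not a prefix
    filename = image['src'].split('://', 1)[-1]
    filename = os.path.join(dir, filename.replace('/', '_'))
    data = browser.open(image['src']).read()
    browser.back()
    # Write the raw bytes; the with block closes the file even on error
    with open(filename, 'wb') as save:
        save.write(data)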

This downloads all the images from a web page. For parsing HTML, you are better off using BeautifulSoup or lxml. Downloading is then just a matter of reading the data and writing it to a local file. You need to assign your own value to dir; it is the directory where the images will be saved.
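The file-naming step can also be pulled out into a small helper. The following is only a sketch: the name image_path and the choice to drop query strings and replace every unusual character with an underscore are assumptions, not part of the answer above. It splits on '://' because str.lstrip removes a set of characters rather than a prefix.

```python
import os
import re


def image_path(src, directory):
    """Map an image URL to a flat file path inside `directory` (hypothetical helper)."""
    # Remove the scheme as a prefix ("http://", "https://"), if there is one.
    name = src.split('://', 1)[-1]
    # Drop any query string or fragment so it does not end up in the file name.
    name = re.split(r'[?#]', name, maxsplit=1)[0]
    # Replace path separators and other unusual characters with underscores.
    name = re.sub(r'[^A-Za-z0-9._-]', '_', name)
    return os.path.join(directory, name)
```

Inside the loop, this would be used as filename = image_path(image['src'], dir).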

0xC0000022L · Answer

Not sure why this solution hasn't come up, but you can use the mechanize.Browser.retrieve function as well. Perhaps this only works in newer versions of mechanize and has thus not been mentioned?

Anyway, if you wanted to shorten the answer by zhangyangyu, you could do this:

import mechanize, os
from BeautifulSoup import BeautifulSoup

browser = mechanize.Browser()
html = browser.open(url)
soup = BeautifulSoup(html)
image_tags = soup.findAll('img')
for image in image_tags:
    # Drop the URL scheme as a prefix; lstrip('http://') strips characters, not a prefix
    filename = image['src'].split('://', 1)[-1]
    filename = os.path.join(dir, filename.replace('/', '_'))
    browser.retrieve(image['src'], filename)
    browser.back()

Also keep in mind that you'll likely want to wrap the download in a try/except block like this one:

import mechanize, os
from BeautifulSoup import BeautifulSoup

browser = mechanize.Browser()
html = browser.open(url)
soup = BeautifulSoup(html)
image_tags = soup.findAll('img')
for image in image_tags:
    # Drop the URL scheme as a prefix; lstrip('http://') strips characters, not a prefix
    filename = image['src'].split('://', 1)[-1]
    filename = os.path.join(dir, filename.replace('/', '_'))
    try:
        browser.retrieve(image['src'], filename)
        browser.back()
    except (mechanize.HTTPError, mechanize.URLError) as e:
        # Use e.code and e.read() with HTTPError
        # Use e.reason.args with URLError
        pass

Of course you'll want to adjust this to your needs. Perhaps you want it to bomb out if it encounters an issue. It totally depends on what you want to achieve.
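One way to make that choice explicit is a flag that switches between stopping at the first failure and collecting failures for later. This is a rough sketch, not mechanize API: retrieve_all, its strict flag, and its errors parameter are hypothetical names, and the download function is passed in so the loop does not depend on any particular library.

```python
import os


def retrieve_all(retrieve, sources, directory, errors=(IOError,), strict=False):
    """Fetch each (url, filename) pair; return a list of (url, error) failures.

    `retrieve` is any callable taking (url, target_path), for example
    browser.retrieve. `errors` is the tuple of exception types to handle.
    """
    failed = []
    for src, name in sources:
        target = os.path.join(directory, name)
        try:
            retrieve(src, target)
        except errors as exc:
            if strict:
                raise  # bomb out on the first failure
            failed.append((src, exc))  # otherwise record it and carry on
    return failed
```

With mechanize, the call would presumably look like retrieve_all(browser.retrieve, pairs, dir, errors=(mechanize.HTTPError, mechanize.URLError)), where pairs is a list of (url, filename) tuples built from the image tags.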

Tags: python, mechanize

XVirtusX