Downloading a file using IE from Python

I'm trying to download a file with Python using IE:

from win32com.client import DispatchWithEvents

class EventHandler(object):
    def OnDownloadBegin(self):
        pass

ie = DispatchWithEvents("InternetExplorer.Application", EventHandler)

ie.Visible = 0

ie.Navigate('http://website/file.xml')

After this, a window appears asking the user where to save the file. How can I save the file automatically from Python?

I need to use a real browser rather than urllib or mechanize, because before downloading the file I need to interact with some AJAX functionality.

Adam asked Sep 09 '09 10:09


4 Answers

This works for me as long as the IE dialogs are in the foreground and the downloaded file does not already exist in the "Save As" directory:

import time
import threading
import win32ui, win32gui, win32con, pythoncom
from win32com.client import Dispatch

class IeThread(threading.Thread):
    def run(self):
        # COM must be initialized in every thread that uses it
        pythoncom.CoInitialize()
        ie = Dispatch("InternetExplorer.Application")
        ie.Visible = 0
        ie.Navigate('http://website/file.xml')

def PushButton(handle, label):
    # EnumChildWindows callback: click the child control whose caption matches label
    if win32gui.GetWindowText(handle) == label:
        win32gui.SendMessage(handle, win32con.BM_CLICK, 0, 0)
    return True  # keep enumerating the remaining child windows

IeThread().start()
time.sleep(3)  # wait until IE is started
wnd = win32ui.GetForegroundWindow()
if wnd.GetWindowText() == "File Download - Security Warning":
    win32gui.EnumChildWindows(wnd.GetSafeHwnd(), PushButton, "&Save")
    time.sleep(1)  # give the "Save As" dialog time to appear
    wnd = win32ui.GetForegroundWindow()
if wnd.GetWindowText() == "Save As":
    win32gui.EnumChildWindows(wnd.GetSafeHwnd(), PushButton, "&Save")
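
The fixed `time.sleep` delays above are a guess at how long IE needs, so they can be too short on a slow machine. One possible alternative, sketched here as an assumption rather than as part of the tested answer, is to poll for the dialog with a timeout. The helper name `wait_until` and its parameters are made up for illustration; the commented usage relies only on the `win32ui` calls already used above.

```python
import time


def wait_until(predicate, timeout=10.0, interval=0.1):
    """Poll predicate() until it returns a truthy value or the timeout elapses.

    Returns the truthy value, or None if the timeout was reached first.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = predicate()
        if result:
            return result
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)


# Hypothetical usage on Windows (not executed here):
#
# def foreground_titled(title):
#     wnd = win32ui.GetForegroundWindow()
#     return wnd if wnd.GetWindowText() == title else None
#
# wnd = wait_until(lambda: foreground_titled("Save As"), timeout=5)
```

If `wait_until` returns None, the dialog never came to the foreground and the script can report that instead of silently doing nothing.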

cgohlke answered Nov 15 '22 00:11


I don't know how to say this nicely, but this sounds like about the most foolhardy software idea in recent memory. Python is much more capable of performing AJAX calls than IE is.

To access the data, yes, you can use urllib and urllib2. If the response contains JSON, there's the json library; likewise for XML and HTML, there's BeautifulSoup.
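
As a rough illustration of that approach, here is a minimal sketch in Python 3, where urllib2's functionality lives in `urllib.request`. It assumes the AJAX step amounts to ordinary HTTP requests that set cookies, which may not hold for the site in the question. The helper names (`make_opener`, `download`, `xml_root_tag`) are invented for this example, and the standard-library XML parser stands in for BeautifulSoup to keep it dependency-free.

```python
import http.cookiejar
import shutil
import urllib.request
import xml.etree.ElementTree as ET


def make_opener():
    """Return an opener that keeps cookies across requests, like a browser session."""
    jar = http.cookiejar.CookieJar()
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))


def download(opener, url, dest_path):
    """Stream the response body for url into dest_path and return dest_path."""
    with opener.open(url) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest_path


def xml_root_tag(path):
    """Parse the saved XML file and return the tag name of its root element."""
    return ET.parse(path).getroot().tag


# Hypothetical usage (URLs are placeholders, not executed here):
#
# opener = make_opener()
# opener.open("http://website/some-ajax-endpoint").read()  # the step that sets cookies
# download(opener, "http://website/file.xml", "file.xml")
# A JSON response body could be decoded with json.loads(body) instead.
```

No "Save As" dialog is involved, because the script itself decides where the bytes are written.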

For one project, I had to write a Python program that would simulate a browser and sign into any of 20 different social networks (remember Friendster? Orkut? CyberWorld? I do), and upload images and text into the user's account, even grasping CAPTCHAs and complex JavaScript interactions. Pure Python makes it (comparatively) easy; as you've already seen, trying to use IE makes it impossible.

Michael Lorton answered Nov 15 '22 00:11


PAMIE, perhaps? From the project's description:

P.A.M.I.E. - stands for Python Automated Module For I.E.

Pamie's main use is for testing web sites by which you automate the Internet Explorer client using the Pamie scripting language. PAMIE is not a record playback engine!

Pamie allows you to automate I.E. by manipulating I.E.'s Document Object Model via COM. This Free tool is for use by Quality Assurance Engineers and Developers.

John La Rooy answered Nov 14 '22 23:11


If you can't control Internet Explorer using its COM interface, I suggest using AutoIt's COM interface (AutoItX) to control its GUI from Python.

Cristian Ciupitu answered Nov 15 '22 01:11