I've been working on a web crawler written in C# using System.Windows.Forms.WebBrowser. I am trying to download a file from a website and save it on the local machine. More importantly, I would like this to be fully automated. The download is started by clicking a button that calls a JavaScript function, which triggers the download and displays a “Do you want to open or save this file?” dialog. I definitely do not want to be manually clicking “Save as” and typing in the file name.
I am aware of the download functions of HttpWebRequest and WebClient, but since the download is started by JavaScript, I do not know the URL of the file. FYI, the JavaScript is a doPostBack function that changes some values and submits a form.
I’ve tried getting focus on the Save As dialog from WebBrowser to automate it from there, without much success. I know there’s a way to force the download to save instead of asking to save or open by adding a header to the HTTP request, but I don’t know how to specify the file path to download to.
I think you should prevent the download dialog from showing at all. Here is one way that might work:
1. The JavaScript code causes your WebBrowser control to navigate to a specific URL (which is what makes the download dialog appear).
2. To prevent the WebBrowser control from actually navigating to this URL, attach an event handler to the Navigating event.
3. In your Navigating handler, check whether this is the navigation you want to stop. Is it the download URL? You could check for a file extension, for example; there must be some recognizable format. Use WebBrowserNavigatingEventArgs.Url to do so.
4. If this is the right URL, stop the navigation by setting the WebBrowserNavigatingEventArgs.Cancel property to true.
5. Continue the download yourself with the HttpWebRequest or WebClient classes.
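The steps above could be sketched roughly as follows. This is only a minimal illustration, not tested against your site: the DownloadInterceptor class, the SaveFolder path, and the list of extensions in LooksLikeDownload are all hypothetical names and values chosen for the example.

```csharp
using System;
using System.IO;
using System.Net;
using System.Windows.Forms;

static class DownloadInterceptor
{
    // Hypothetical target folder; change it to wherever files should go.
    const string SaveFolder = @"C:\Downloads";

    // Hook the handler up once, e.g. right after creating the control.
    public static void Attach(WebBrowser browser)
    {
        browser.Navigating += OnNavigating;
    }

    static void OnNavigating(object sender, WebBrowserNavigatingEventArgs e)
    {
        // Let ordinary page navigations through untouched.
        if (!LooksLikeDownload(e.Url)) return;

        // Stop the control from navigating, so no open/save dialog appears.
        e.Cancel = true;

        Directory.CreateDirectory(SaveFolder);
        string fileName = Path.GetFileName(e.Url.LocalPath);
        string target = Path.Combine(SaveFolder, fileName);

        // Fetch the file ourselves. DownloadFile blocks the calling (UI)
        // thread until the transfer finishes.
        using (var client = new WebClient())
        {
            client.DownloadFile(e.Url, target);
        }
    }

    // Assumption: download URLs can be recognized by their extension.
    public static bool LooksLikeDownload(Uri url)
    {
        string ext = Path.GetExtension(url.LocalPath);
        return ext.Equals(".pdf", StringComparison.OrdinalIgnoreCase)
            || ext.Equals(".zip", StringComparison.OrdinalIgnoreCase);
    }
}
```

Call DownloadInterceptor.Attach(webBrowser1) once before triggering the button click. Note that WebClient sends a plain GET and does not share the WebBrowser control's cookies, so this fits direct file links best; if the site requires a login session or the download is the response to a form post, the request would need the cookies or form data added by hand.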
Have a look at this page for more info on the event:
http://msdn.microsoft.com/en-us/library/system.windows.forms.webbrowser.navigating.aspx
A similar solution is available at http://social.msdn.microsoft.com/Forums/en/csharpgeneral/thread/d338a2c8-96df-4cb0-b8be-c5fbdd7c9202/?prof=required
This works perfectly if there is a direct URL that includes the name of the file being downloaded.
But sometimes the file is generated dynamically. In that case the URL doesn't contain a file name; the website creates the file only after the URL is requested, and then the open/save dialog appears.
For example, some links generate a PDF file on the fly.
How can this type of URL be handled?