I'm trying to download remote HTML pages with my C++ program, but with some URLs a timeout occurs and I don't know how to handle it, so the program just hangs indefinitely.
virtual void downloadpage(string pageaddress) {
    CURL *curl;
    CURLcode informationdownloaded;
    curl = curl_easy_init();
    if (curl) {
        curl_easy_setopt(curl, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/525.13 (KHTML, like Gecko) Chrome/0.A.B.C Safari/525.13");
        curl_easy_setopt(curl, CURLOPT_URL, pageaddress.c_str());
        curl_easy_setopt(curl, CURLOPT_HEADER, 0L);         // don't include response headers in the output
        curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 1L); // follow HTTP redirects
        curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, writepageinformation);
        curl_easy_setopt(curl, CURLOPT_WRITEDATA, &pageinformation);
        informationdownloaded = curl_easy_perform(curl);
        curl_easy_cleanup(curl);
    }
}
Here is my function for downloading the HTML source of a page into a string variable called "pageinformation" via the "writepageinformation" callback.
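For reference, writepageinformation is a standard libcurl write callback that appends the received data into pageinformation; something along these lines (a sketch, the exact body isn't important to the question):

// Appends each received chunk into the std::string passed via CURLOPT_WRITEDATA.
static size_t writepageinformation(char *ptr, size_t size, size_t nmemb, void *userdata) {
    std::string &page = *static_cast<std::string *>(userdata);
    page.append(ptr, size * nmemb);
    return size * nmemb; // tell curl the whole chunk was consumed
}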
informationdownloaded = curl_easy_perform(curl);
You can also specify a timeout for your download:
curl_easy_setopt(hCurl, CURLOPT_TIMEOUT, iTimeoutSeconds); // timeout for the URL to download
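For example, applied to the function from the question (the 10- and 30-second values are arbitrary; CURLE_OPERATION_TIMEDOUT is the code curl_easy_perform() returns when the timeout fires):

curl_easy_setopt(curl, CURLOPT_CONNECTTIMEOUT, 10L); // abort if connecting alone takes more than 10 s
curl_easy_setopt(curl, CURLOPT_TIMEOUT, 30L);        // abort if the whole transfer takes more than 30 s
informationdownloaded = curl_easy_perform(curl);
if (informationdownloaded == CURLE_OPERATION_TIMEDOUT) {
    // the request timed out instead of hanging; handle it here
}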
The curl_easy_perform() call blocks until the entire file is downloaded or an error occurs. If you want to be able to interrupt the blocked call (for example, to handle a kill signal), install a progress callback, like below:
curl_easy_setopt(hCurl, CURLOPT_NOPROGRESS, 0L); // progress callbacks are off by default; enable them
curl_easy_setopt(hCurl, CURLOPT_PROGRESSFUNCTION, progress_callback);
curl_easy_setopt(hCurl, CURLOPT_PROGRESSDATA, this); // passed through as the callback's clientp argument
static int progress_callback(void *clientp,
                             double dltotal, double dlnow,
                             double ultotal, double ulnow)
{
    CLASS &obj = *(CLASS *)clientp;
    if (obj.exit)
        return 1; // non-zero signals curl to abort and return from curl_easy_perform()
    return 0;     // zero lets the transfer continue
}
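On libcurl 7.32.0 and newer, CURLOPT_PROGRESSFUNCTION is deprecated in favour of CURLOPT_XFERINFOFUNCTION, which works the same way but passes curl_off_t counters instead of double; a minimal equivalent sketch:

static int xferinfo_callback(void *clientp,
                             curl_off_t dltotal, curl_off_t dlnow,
                             curl_off_t ultotal, curl_off_t ulnow)
{
    CLASS &obj = *(CLASS *)clientp;
    return obj.exit ? 1 : 0; // non-zero aborts the transfer, as above
}

curl_easy_setopt(hCurl, CURLOPT_NOPROGRESS, 0L);
curl_easy_setopt(hCurl, CURLOPT_XFERINFOFUNCTION, xferinfo_callback);
curl_easy_setopt(hCurl, CURLOPT_XFERINFODATA, this);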