
cURL - works in browser, not in cURL

I have been struggling with this task for almost three days now, and I guess I am missing some basic cURL skills.

I start with:

In the IE F12 developer tools I see two POSTs on the first page. (I notice the first one gets a 302, which is supposed to be a redirect, whereas with cURL I only get a 200.)

Filling up the captcha:

on the second page (after captcha):

traffic:

This is my code (and I cannot move on with it because it doesn't work for the early stages):

I built a special form that submits via GET to my own page (the one with the cURL code), which in turn accesses the website:

$id=$_GET['id']; // getting the biznumber
$humanCode=$_GET['nobot'];

$curl = curl_init();
curl_setopt ($curl, CURLOPT_URL, "https://www.*******.******.***");


// setting some https to be able to access the website from my local computer.
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, true);
curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, 2);
curl_setopt($curl, CURLOPT_CAINFO, "c:/xampp/htdocs/CAcerts/curl-ca-bundle.crt");

// I know the values for the ASPX vars like __EVENTTARGET, __EVENTARGUMENT, __VIEWSTATE are arbitrary now. I need to take care of that but I don't yet know how.

$postarr= array (
                "__EVENTTARGET"=>"",
                "__VIEWSTATE" =>"%2FwEPDwULLTEzMzI2OTg4NDYPZBYCZg9kFgQCBA8PZBYCHgdvbmNsaWNrBQxnb1RvTWl2emFrKClkAgYPD2QWAh8ABQxnb1RvTWl2emFrKClkZM6iZZ0Qaf2CpfXoJJdZ0IqaWsDO",
                "__EVENTARGUMENT" =>"",
                "__EVENTVALIDATION" =>"%2FwEWBQKgysLGCwL2r7SGDQLh4ri%2BAwLWws7NDwLWwpLPD%2F1HuCAFYzs2seaziWbYEXjDfigP",
                "hidUrlFileIshurim"=>"https%3A%2F", // value truncated in the original post
                "cod"=>"3322"
                    );

$fields_string='';
foreach($postarr as $key=>$value) { $fields_string .= $key.'='.$value.'&'; }
$fields_string = rtrim($fields_string,'&'); // rtrim() returns the result; it does not modify its argument

curl_setopt($curl, CURLOPT_POST      ,1);
curl_setopt($curl, CURLOPT_POSTFIELDS, $fields_string);

curl_setopt($curl, CURLOPT_TIMEOUT, 10);

// CURLOPT_USERAGENT takes only the header value, without the "User-Agent" name.
curl_setopt ($curl, CURLOPT_USERAGENT, "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0; MAAU)");

// I made a cookie file and it seems to work
$cookiefile = "d:/cookie.txt";
curl_setopt($curl, CURLOPT_COOKIEJAR, $cookiefile); 
curl_setopt($curl, CURLOPT_COOKIEFILE, $cookiefile);

curl_setopt($curl, CURLOPT_FRESH_CONNECT , 1);
curl_setopt($curl, CURLOPT_FOLLOWLOCATION  ,1);

curl_setopt($curl, CURLOPT_HEADER      ,1);  // DO NOT RETURN HTTP HEADERS
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);

$temp=curl_exec($curl);
$info = curl_getinfo($curl);

$html = mb_convert_encoding($temp, 'HTML-ENTITIES', 'utf-8'); 
echo "ERRCODE: ".curl_error($curl);
echo '<br /><br />';
echo "INFO : ";
print_r($info);
echo '<br /><br />';
$httpcode = curl_getinfo($curl, CURLINFO_HTTP_CODE);
echo "CODE: ".$httpcode;
echo '<br /><br />';
echo "VARS: ".$fields_string; // the POST body that was sent
echo '<br /><br />';
//echo $html;

curl_setopt ($curl, CURLOPT_URL, "https://www.*******.******.***");
curl_setopt($curl, CURLOPT_FRESH_CONNECT , 0);

echo "<br /><br /><b>2nd</b><br /><br />";
$temp=curl_exec($curl);
$info = curl_getinfo($curl);

$html = mb_convert_encoding($temp, 'HTML-ENTITIES', 'utf-8'); 
echo "ERRCODE: ".curl_error($curl);
echo '<br /><br />';
echo "INFO : ";
print_r($info);
echo '<br /><br />';
echo $html;

Can't get that to even start to work. It starts with returning me a 200 OK, instead of 302, and sometimes I also get a 500.

I know the ASPX vars might actually be crucial, but if my browser can produce these vars and send them to the server, can't cURL do the same?

Thanks for any help!

Ted asked Apr 17 '12 17:04


2 Answers

Problem solved.
It was a matter of sending the correct headers. Following the traffic reported by the browser, I reproduced every step and got the expected result.

I went through each step by using:

curl_init()
curl_setopt()
..
curl_setopt()
curl_exec()
curl_close()

This way I had to set up each request manually and go through all of its settings. It made the code longer, but much easier to understand.
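The one-handle-per-request pattern described above might look roughly like this. It is only a sketch: the helper names `buildStepOptions` and `runStep` are hypothetical and do not come from the original code, and the option set is a guess at what each step needs.

```php
<?php
// Hypothetical helper: builds the option set for ONE request in the sequence.
// Passing null for $postFields makes the step a plain GET.
function buildStepOptions(string $url, ?array $postFields, string $cookieFile): array
{
    $options = [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_HEADER         => true,        // keep response headers for inspection
        CURLOPT_FOLLOWLOCATION => false,       // treat each redirect as its own step
        CURLOPT_COOKIEJAR      => $cookieFile, // cookies are saved when the handle is released
        CURLOPT_COOKIEFILE     => $cookieFile, // cookies from earlier steps are sent
        CURLOPT_TIMEOUT        => 10,
    ];
    if ($postFields !== null) {
        $options[CURLOPT_POST] = true;
        // http_build_query() URL-encodes keys and values, so pass raw values.
        $options[CURLOPT_POSTFIELDS] = http_build_query($postFields);
    }
    return $options;
}

// Hypothetical helper: one full init / setopt / exec / close cycle.
function runStep(array $options): array
{
    $curl = curl_init();
    curl_setopt_array($curl, $options);
    $response = curl_exec($curl);
    $result = [
        'response' => $response,
        'info'     => curl_getinfo($curl),
        'error'    => curl_error($curl),
    ];
    curl_close($curl); // as in the list above; a no-op since PHP 8.0
    return $result;
}
```

Each step would then be something like `runStep(buildStepOptions($url, $fields, $cookieFile))`, with `$fields` rebuilt from the previous response before the next call.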

I suspected the site relied on some special JavaScript code to work, so I spent a lot of time worrying about all the extra JavaScript, which turned out to be unnecessary.

It was all about being a lot more organized and following the correct header settings.

Moreover, since this was an ASPX site, I had to read and remember the __VIEWSTATE and __EVENTVALIDATION values of the previous page in each iteration. That was the main reason for the 500 Internal Server Error I used to get all the time.
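Reading the hidden ASPX fields from the previous page could be sketched as follows. The function name `extractHiddenFields` and the sample page fragment are made up for illustration; the real page will carry different values and possibly more hidden fields.

```php
<?php
// Collects every hidden <input> (e.g. __VIEWSTATE, __EVENTVALIDATION)
// from a page so the values can be sent back in the next POST.
function extractHiddenFields(string $html): array
{
    $fields = [];
    if (trim($html) === '') {
        return $fields; // nothing to parse
    }
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // real-world HTML is rarely valid
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Assumes the type attribute is written in lowercase, as ASP.NET does.
    foreach ($xpath->query('//input[@type="hidden"][@name]') as $input) {
        $fields[$input->getAttribute('name')] = $input->getAttribute('value');
    }
    return $fields;
}

// Made-up fragment standing in for the previous response body.
$page = '<form><input type="hidden" name="__VIEWSTATE" value="/wEPDw==" />'
      . '<input type="hidden" name="__EVENTVALIDATION" value="/wEWBQ==" />'
      . '<input type="text" name="cod" /></form>';

$post = extractHiddenFields($page);
$post['cod'] = '3322';           // the user-entered field from the question
$body = http_build_query($post); // URL-encodes "/" and "=" in the values
echo $body, "\n";
```

Because the values are taken raw from the HTML and encoded once by `http_build_query()`, there is no need to paste pre-encoded strings such as `%2FwEPDw...` into the array.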

I used Firebug and LiveHttpHeaders to work out each step.

Ted answered Oct 19 '22 00:10


"Can't get that to even start to work. It starts with returning me a 200 OK, instead of 302, and sometimes I also get a 500."

curl_setopt($curl, CURLOPT_FOLLOWLOCATION  ,1);

You have cURL set to follow any 302 redirects. These are followed internally by cURL, so PHP only sees the final response.
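One way to check whether a redirect happened is to look at the array returned by `curl_getinfo($curl)` after `curl_exec()`. The sketch below works on hand-written sample arrays rather than a live request; `describeRedirects` is a hypothetical helper name.

```php
<?php
// Summarises the array returned by curl_getinfo($curl) after curl_exec().
function describeRedirects(array $info): string
{
    $count = $info['redirect_count'] ?? 0;
    $code  = $info['http_code'] ?? 0;
    if ($count > 0) {
        // CURLOPT_FOLLOWLOCATION was on: http_code belongs to the LAST response.
        return "followed $count redirect(s), final status $code at " . ($info['url'] ?? '?');
    }
    if ($code >= 300 && $code < 400) {
        // CURLOPT_FOLLOWLOCATION was off: the redirect itself is visible.
        return "redirect $code not followed, target: " . ($info['redirect_url'] ?? '?');
    }
    return "no redirect, status $code";
}

// Hand-written sample values shaped like curl_getinfo() output.
echo describeRedirects(['http_code' => 200, 'redirect_count' => 1, 'url' => 'https://example.test/next']), "\n";
echo describeRedirects(['http_code' => 302, 'redirect_count' => 0, 'redirect_url' => 'https://example.test/next']), "\n";
```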

Also:

curl_setopt($curl, CURLOPT_HEADER      ,1);  // DO NOT RETURN HTTP HEADERS

The comment says exactly the opposite of what the code does, which seems wrong.
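With CURLOPT_HEADER enabled, the string returned by `curl_exec()` starts with the response headers, followed by the body. A minimal sketch of separating the two, assuming the header size comes from `curl_getinfo($curl, CURLINFO_HEADER_SIZE)`; the function name `splitResponse` and the sample response are invented for illustration.

```php
<?php
// Splits a response captured with CURLOPT_HEADER enabled.
// $headerSize is the value reported by CURLINFO_HEADER_SIZE; when redirects
// are followed it covers the header blocks of every response in the chain.
function splitResponse(string $raw, int $headerSize): array
{
    $headerText = substr($raw, 0, $headerSize);
    $body       = substr($raw, $headerSize);

    // Each response in the chain starts with its own status line.
    preg_match_all('/^HTTP\/\d+(?:\.\d+)?\s+(\d{3})/m', $headerText, $matches);

    return [
        'statuses' => array_map('intval', $matches[1]),
        'headers'  => $headerText,
        'body'     => $body,
    ];
}

// Hand-written sample: a 302 followed by the final 200.
$headers = "HTTP/1.1 302 Found\r\nLocation: /next\r\n\r\n"
         . "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n";
$raw = $headers . "<html>ok</html>";

$parts = splitResponse($raw, strlen($headers));
print_r($parts['statuses']);
echo $parts['body'], "\n";
```

Listing the status codes this way also shows a followed 302 that would otherwise stay hidden behind the final 200.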

Danack answered Oct 18 '22 22:10