I want to cURL a URL and record every individual URL it is redirected through along the way. So far the only way I have managed this is with recursive cURL calls, which is not ideal. Perhaps I am missing some easy option. Thoughts?
$url = "some url with redirects";
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_NOBODY, false);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.1) Gecko/20061024 BonEcho/2.0");
$html = curl_exec($ch);
$info = array();
if (!curl_errno($ch))
{
    $info = curl_getinfo($ch);
    echo "<pre>";
    print_r($info);
    echo "</pre>";
}
and I get a response like this:
Array
(
[url] => THE LAST URL THAT WAS HIT
[content_type] => text/html; charset=utf-8
[http_code] => 200
[header_size] => 1942
[request_size] => 1047
[filetime] => -1
[ssl_verify_result] => 0
[redirect_count] => 2 <---- I WANT THESE
[total_time] => 0.799589
[namelookup_time] => 0.000741
[connect_time] => 0.104206
[pretransfer_time] => 0.104306
[size_upload] => 0
[size_download] => 49460
[speed_download] => 61856
[speed_upload] => 0
[download_content_length] => 49460
[upload_content_length] => 0
[starttransfer_time] => 0.280781
[redirect_time] => 0.400723
)
By default, curl does not follow redirects; it returns the 3xx response itself, including its body (if any). To follow redirects from the command line, use the -L or --location option. In PHP, the equivalent is setting CURLOPT_FOLLOWLOCATION to true.
php $url="http://libero-news.it.feedsportal.com/c/34068/f/618095/s/2e34796f/l/0L0Sliberoquotidiano0Bit0Cnews0C12735670CI0Esaggi0Eper0Ele0Eriforme0Ecostituzionali0EChiaccherano0Ee0Eascoltano0Bhtml/story01.htm"; $ch = curl_init(); curl_setopt($ch, CURLOPT_URL, $url); curl_setopt($ch, CURLOPT_HEADER, true); curl_setopt($ ...
In HTTP, redirection is triggered by a server sending a special redirect response to a request. Redirect responses have status codes that start with 3, and a Location header holding the URL to redirect to. When browsers receive a redirect, they immediately load the new URL provided in the Location header.
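To make the mechanism above concrete: when CURLOPT_HEADER and CURLOPT_FOLLOWLOCATION are both enabled, the header block of every response in the chain is included in the output, one after another. The sketch below pulls the 3xx status codes and Location values out of such raw header text. The function name extractRedirectHops is made up for this illustration, and the code assumes PHP 7 or later. Note that Location values are returned exactly as the server sent them, so relative ones are not resolved.

```php
<?php
/**
 * Illustrative helper (hypothetical name): list each redirect found in raw
 * HTTP header text that contains one header block per response.
 */
function extractRedirectHops(string $rawHeaders): array
{
    $hops = [];
    // Header blocks are separated from each other by a blank line.
    foreach (preg_split('/\r?\n\r?\n/', trim($rawHeaders)) as $block) {
        // Each block must start with a status line such as "HTTP/1.1 301 ...".
        if (!preg_match('#^HTTP/\S+\s+(\d{3})#', $block, $m)) {
            continue;
        }
        $status = (int) $m[1];
        $location = null;
        // Header names are case-insensitive, hence the "i" flag.
        if (preg_match('/^Location:[ \t]*(.+?)\s*$/mi', $block, $lm)) {
            $location = $lm[1];
        }
        // Only 3xx responses that carry a Location header count as a hop.
        if ($status >= 300 && $status < 400 && $location !== null) {
            $hops[] = ['status' => $status, 'location' => $location];
        }
    }
    return $hops;
}
```

With the question's variables, the header portion could be obtained with substr($html, 0, $info['header_size']) and passed to this function.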
With libcurl, you can use the CURLINFO_REDIRECT_URL getinfo variable to find out the URL a request would have been redirected to had redirect following been enabled. This allows programs to easily traverse the redirects themselves.
This approach is much better and easier than parsing the Location: headers, as others have suggested here, because then your code must rebuild relative paths and so on. CURLINFO_REDIRECT_URL handles that for you automatically.
The PHP/CURL binding added support for this feature in PHP 5.3.7:
$url = curl_getinfo($ch, CURLINFO_REDIRECT_URL);
The commit that fixed this:
https://github.com/php/php-src/commit/689268a0ba4259c8f199cae6343b3d17cab9b6a5
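As a rough sketch of traversing the redirects yourself with CURLINFO_REDIRECT_URL: the loop below requests each URL with redirect following turned off, records it, and moves on to whatever URL curl reports next. The function names probeOnce and traceRedirects, the $maxHops limit, and the optional $probe parameter (there only so the loop can be exercised without network access) are assumptions made for this example, not part of the cURL extension. It assumes PHP 7.1 or later, and it sends body-less requests via CURLOPT_NOBODY, which some servers answer differently from a normal GET.

```php
<?php
/**
 * Request one URL without following redirects.
 * Returns [HTTP status, next URL or null]. A status of 0 means the
 * request itself failed.
 */
function probeOnce(string $url): array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false); // the caller walks the chain
    curl_setopt($ch, CURLOPT_NOBODY, true);          // assumption: headers are enough
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);
    curl_exec($ch);

    $status = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
    // Empty or false when the response was not a redirect.
    $next = curl_getinfo($ch, CURLINFO_REDIRECT_URL);

    return [$status, $next ? $next : null];
}

/**
 * Follow a redirect chain one hop at a time and return every URL visited,
 * each with its status code. Stops after $maxHops redirects to avoid loops.
 */
function traceRedirects(string $url, int $maxHops = 10, ?callable $probe = null): array
{
    $probe = $probe ?? 'probeOnce';
    $chain = [];
    for ($hop = 0; $url !== null && $hop <= $maxHops; $hop++) {
        [$status, $next] = $probe($url);
        $chain[] = ['url' => $url, 'status' => $status];
        $url = $next;
    }
    return $chain;
}
```

Calling print_r(traceRedirects($url)) would then list each URL in the chain in order, ending with the final destination (or with the hop at which the limit was reached).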