I'm trying to download the contents of a web page using PHP. When I issue the command:
$f = file_get_contents("http://mobile.mybustracker.co.uk/mobile.php?searchMode=2");
It returns a page that reports that the server is down. Yet when I paste the same URL into my browser I get the expected page.
Does anyone have any idea what's causing this? Does file_get_contents transmit any headers that differentiate it from a browser request?
The file_get_contents() reads a file into a string. This function is the preferred way to read the contents of a file into a string.
curl supports HTTPS certificates, HTTP POST, HTTP PUT, FTP uploading HTTP form based upload, proxies, cookies. Curl is a much faster alternative to file_get_contents.
it takes anywhere from 30-90 seconds to process. It's not limited to our server, it is slow when accessing any external url, such as http://www.google.com. I believe the script calls the full url because there are query string variables that are necessary that don't work if you call the file locally.
Yes, there are differences -- the browser tends to send plenty of additionnal HTTP headers, I'd say ; and the ones that are sent by both probably don't have the same value.
Here, after doing a couple of tests, it seems that passing the HTTP header called Accept
is necessary.
This can be done using the third parameter of file_get_contents
, to specify additionnal context informations :
$opts = array('http' =>
array(
'method' => 'GET',
//'user_agent ' => "Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2) Gecko/20100301 Ubuntu/9.10 (karmic) Firefox/3.6",
'header' => array(
'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*\/*;q=0.8
'
),
)
);
$context = stream_context_create($opts);
$f = file_get_contents("http://mobile.mybustracker.co.uk/mobile.php?searchMode=2", false, $context);
echo $f;
With this, I'm able to get the HTML code of the page.
Notes :
User-Agent
, but it doesn't seem to be necessary -- which is why the corresponding line is here as a comment
Accept
header is the one Firefox used when I requested that page with Firefox before trying with file_get_contents
.
For more informations, you can take a look at :
file_get_contents
stream_context_create
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With