file_get_contents('https://en.wikipedia.org/wiki/Category:Upcoming_singles');
returns a different response (2 products) than visiting the same address in the Chrome web browser (which shows 4 products).
Upon inspection, I suspect this might be related to
Saved in parser cache key with ... timestamp ...
in the HTML returned. The timestamp is older when I use file_get_contents().
Any ideas on how to fetch the latest info using file_get_contents()?
Thank you!
Assuming file_get_contents is making an HTTP request, it would be good to check which user agent it sends. By default PHP sends no User-Agent header at all unless the user_agent ini setting is configured.
I've heard of problems fetching data with some user agents. Take a look at this question.
You can specify other options (including the user agent) by using a stream context:
<?php
$opts = array(
    'http' => array(
        'method' => "GET",
        'header' => "Accept-language: en\r\n" .
                    "Cookie: foo=bar\r\n"
    )
);
$context = stream_context_create($opts);
// Open the file using the HTTP headers set above
$file = file_get_contents('http://www.example.com/', false, $context);
Take a look at the file_get_contents docs.
Also, as Jack said, cURL is a better option.
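For reference, here is a minimal sketch of what the cURL route might look like. The function name fetchWithCurl and the 10-second timeout are my own choices for illustration, not something from the question, and it assumes the PHP cURL extension is installed:

```php
<?php
/**
 * Hypothetical helper: fetch $url with cURL, sending an explicit User-Agent.
 * Returns the response body as a string, or false on failure.
 */
function fetchWithCurl($url, $userAgent)
{
    $ch = curl_init($url);
    curl_setopt_array($ch, array(
        CURLOPT_RETURNTRANSFER => true,        // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,        // follow redirects (e.g. http to https)
        CURLOPT_USERAGENT      => $userAgent,  // the User-Agent header to send
        CURLOPT_HTTPHEADER     => array('Accept-Language: en'),
        CURLOPT_TIMEOUT        => 10,          // assumed limit, in seconds
    ));
    $body = curl_exec($ch);
    if ($body === false) {
        // Log the transport error rather than failing silently
        error_log('cURL error: ' . curl_error($ch));
    }
    return $body;
}
```

You would call it as fetchWithCurl('https://en.wikipedia.org/wiki/Category:Upcoming_singles', $yourUserAgent) and check the result against false before using it.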
EDIT:
You misunderstood me. What you have to add is a different user agent. For example, using the user agent from Mozilla Firefox gets you the 4 results:
<?php
$opts = array(
    'http' => array(
        'method' => "GET",
        'header' => "Accept-language: en\r\n" .
                    "User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; es-AR; rv:1.9.2.23) Gecko/20110921 Ubuntu/10.10 (maverick) Firefox/3.6.23"
    )
);
$context = stream_context_create($opts);
// Open the file using the HTTP headers set above
$file = file_get_contents('http://en.wikipedia.org/wiki/Category:Upcoming_singles', false, $context);
print $file;
But I don't think this is really "legitimate": it's not good to cheat by pretending to be a browser. I think there must be some other user agent that Wikipedia accepts for fetching its data from outside apps.
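A more honest alternative to spoofing Firefox might be to send a descriptive user agent that identifies your own tool. The sketch below is only an illustration under that assumption: the helper name buildContextOptions, the tool name, version, and contact address are all placeholders I made up, and I have not verified that this exact string returns the 4 results. Wikimedia publishes a User-Agent policy that is worth checking for the expected format.

```php
<?php
/**
 * Hypothetical helper: build stream-context options that identify the client.
 * $toolName, $version and $contact are placeholders you would replace.
 */
function buildContextOptions($toolName, $version, $contact)
{
    // Format assumed here: "Name/Version (contact)"
    $userAgent = sprintf('%s/%s (%s)', $toolName, $version, $contact);

    return array(
        'http' => array(
            'method'     => 'GET',
            'user_agent' => $userAgent,              // sets the User-Agent header
            'header'     => "Accept-Language: en\r\n",
            'timeout'    => 10,                      // assumed limit, in seconds
        ),
    );
}

// Usage (performs a real HTTP request, so it is left commented out):
// $context = stream_context_create(buildContextOptions('MyFetcher', '0.1', 'me@example.com'));
// $html = file_get_contents('https://en.wikipedia.org/wiki/Category:Upcoming_singles', false, $context);
```

Using the 'user_agent' context option instead of a raw "User-Agent:" header line keeps the header string free for other headers.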