file_get_contents script works with some websites but not others

I'm looking to build a PHP script that parses HTML for particular tags. I've been using this code block, adapted from this tutorial:

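<?php
// Fetch the page; file_get_contents() returns false on failure.
$data = file_get_contents('http://www.google.com');
if ($data === false) {
    die('Request failed');
}
// Capture the text between <title> and the next tag.
$regex = '/<title>(.+?)</';
if (preg_match($regex, $data, $match)) {
    var_dump($match);
    echo $match[1];
}
?>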

The script works with some websites (like Google, above), but when I try it with others (FreshDirect, for example), I get this error:

"Warning: file_get_contents(http://www.freshdirect.com) [function.file-get-contents]: failed to open stream: HTTP request failed!"

I've seen a bunch of great suggestions on StackOverflow, for example to enable extension=php_openssl.dll in php.ini. But (1) my version of php.ini didn't have extension=php_openssl.dll in it, and (2) when I added it to the extensions section and restarted the WAMP server, per this thread, still no success.

Would someone mind pointing me in the right direction? Thank you very much!

Jeff Engler asked Feb 24 '23 09:02


1 Answer

The site just requires a User-Agent header on the request. Any string suffices ("any" really works):

file_get_contents("http://www.freshdirect.com",false,stream_context_create(
    array("http" => array("user_agent" => "any"))
));
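The warning in the question only says "HTTP request failed!" and hides the status the server actually returned. A minimal sketch of one way to see it, assuming PHP 7.1 or later: the `ignore_errors` HTTP context option makes file_get_contents() return the body even on error statuses, and the HTTP wrapper fills `$http_response_header` in the calling scope (newer PHP versions also offer a function for this, so treat the variable as an assumption about your version). The helper names `last_status_code` and `fetch_with_status` are hypothetical, not part of PHP.

```php
<?php
// Hypothetical helper: return the status code of the final response.
// With redirects, several "HTTP/..." status lines appear; the last one wins.
function last_status_code(array $headers): ?int
{
    $code = null;
    foreach ($headers as $line) {
        if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $line, $m)) {
            $code = (int) $m[1];
        }
    }
    return $code;
}

// Hypothetical helper: fetch a URL and report the status alongside the body.
function fetch_with_status(string $url, string $userAgent = ''): array
{
    // ignore_errors: return the body even for 4xx/5xx responses.
    $http = ['ignore_errors' => true, 'timeout' => 10];
    if ($userAgent !== '') {
        $http['user_agent'] = $userAgent;
    }
    $body = @file_get_contents($url, false, stream_context_create(['http' => $http]));

    // Set by the HTTP wrapper in this function's scope after the call above.
    $headers = isset($http_response_header) ? $http_response_header : [];

    return ['status' => last_status_code($headers), 'body' => $body];
}
```

Calling fetch_with_status() once without and once with a user agent string should show whether the header is what the server objects to.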

See more options.

Of course, you can also set user_agent globally, either in php.ini or at runtime with ini_set():

 ini_set("user_agent","any");
 echo file_get_contents("http://www.freshdirect.com");

... but I prefer to be explicit for the next programmer working on it.
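Putting the explicit context together with the title extraction from the question might look like the sketch below. It assumes PHP 7.1 or later; `make_context`, `extract_title`, and `fetch_title` are hypothetical names, and the 10-second timeout is an arbitrary choice. The regex is slightly looser than the one in the question so that it tolerates attributes on the tag and titles that span lines.

```php
<?php
// Hypothetical helper: build a stream context that sends a User-Agent header.
function make_context(string $userAgent = 'any')
{
    return stream_context_create([
        'http' => [
            'user_agent' => $userAgent,
            'timeout'    => 10, // seconds; arbitrary value for this sketch
        ],
    ]);
}

// Hypothetical helper: return the decoded, trimmed <title> text, or null.
function extract_title(string $html): ?string
{
    // i: case-insensitive tag name; s: let "." match newlines.
    if (preg_match('/<title[^>]*>(.*?)<\/title>/is', $html, $m)) {
        return trim(html_entity_decode($m[1], ENT_QUOTES | ENT_HTML5, 'UTF-8'));
    }
    return null;
}

// Hypothetical helper: fetch a page and return its title, or null on failure.
function fetch_title(string $url): ?string
{
    $html = @file_get_contents($url, false, make_context());
    return $html === false ? null : extract_title($html);
}
```

For example, `echo fetch_title('http://www.freshdirect.com');` would print the page title if the request succeeds. Keeping the context in one function means the next programmer can see, and change, the user agent in a single place.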

Wrikken answered Mar 05 '23 19:03