 

Facebook graph extremely slow in PHP

Whether I use the Facebook PHP SDK or just load the data directly with $contents = file_get_contents("https://graph.facebook.com/$id?access_token=$accessToken"), each response takes around a whole second to arrive.

That counts as very slow when I need to check the data for a bunch of ids.

In a browser, if I type in a Facebook Graph URL, I get the results almost instantly, in under a tenth of the time it takes in PHP.

What is causing this problem, and how can I make it as fast as it would be in any browser? I know the browser can do it. There has to be a way to make it fast in PHP too.

IDEA: perhaps I need to configure something in cURL?

What I have tried:

  • Using the PHP SDK. It's just as slow. I tried file_get_contents() in the first place because I suspected the PHP SDK wasn't configured properly.
  • Using curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);. On its own, it didn't make a difference. AFTER ANSWER ACCEPT EDIT: actually, this combined with reusing the cURL handle made the subsequent requests really fast.

EDIT: here is a pastebin of the code I used to measure how long the requests take: http://pastebin.com/bEbuqq5g. I corrected the text that used to say microseconds to say seconds. This produces results similar to the ones I wrote in my comment on this question (Facebook graph extremely slow in PHP). Note also that the requests are similarly slow even when the access token is expired, as in my pastebin example.

EDIT 2: part of the problem seems to be SSL. I benchmarked http://graph.facebook.com/4 (no httpS), and three requests took 1.2 seconds, whereas the same three requests over https took 2.2 seconds. This is in no way a solution, though, because any request that needs an access token must use https.

Attila Szeremi asked Jul 10 '12 11:07


4 Answers

file_get_contents can be very slow in PHP because it doesn't send/process headers properly, leading to the HTTP connection not getting closed properly when the file transfer is complete. I have also read about DNS issues, though I don't have any information about that.

The solution that I highly recommend is to either use the PHP SDK, which is designed for making API calls to Facebook, or make use of cURL (which the SDK uses). With cURL you can really configure a lot of aspects of the request, since it's basically designed for making API calls like this.

PHP SDK information: https://developers.facebook.com/docs/reference/php/

PHP SDK source: https://github.com/facebook/facebook-php-sdk

If you choose to do it without the SDK, you could look at how they make use of cURL in base_facebook.php. Here is some sample code you could use to fetch with cURL:

function get_url($url)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HEADER, FALSE);          // Return the body only, no headers
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);   // Return the result instead of outputting it
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);     // Give up after connecting for 10 seconds
    curl_setopt($ch, CURLOPT_TIMEOUT, 60);            // Only execute for 60 seconds at most
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);  // Don't verify the SSL certificate
    $response = curl_exec($ch);
    curl_close($ch);
    return $response;
}

$contents = get_url("https://graph.facebook.com/$id?access_token=$accessToken");

The function will return FALSE on failure.

I see that you said you've used the PHP SDK, but maybe you didn't have cURL set up. Try installing or updating it, and if it still seems to be slow, you should use

curl_setopt($ch, CURLOPT_HEADER, TRUE);
curl_setopt($ch, CURLOPT_VERBOSE, TRUE);

and check out the output.
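If you'd rather capture that verbose output than have it dumped to stderr, cURL can write it to a stream via CURLOPT_STDERR. Here is a sketch (the URL is just an example); the log shows where the time goes: DNS, TCP connect, SSL handshake, or transfer:

```php
<?php
// Fetch a URL while capturing cURL's verbose log in memory.
function get_url_verbose($url)
{
    $ch = curl_init($url);
    $log = fopen('php://temp', 'w+');        // in-memory stream for the log

    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_VERBOSE, true);
    curl_setopt($ch, CURLOPT_STDERR, $log);  // send verbose output to our stream

    $response = curl_exec($ch);

    rewind($log);
    $debug = stream_get_contents($log);
    fclose($log);
    curl_close($ch);

    return array($response, $debug);
}

list($body, $debug) = get_url_verbose('https://graph.facebook.com/4');
echo $debug; // look at the timestamps around "Connected" and the SSL lines
```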

AndrewF answered Oct 17 '22 15:10


I wondered what would happen if I did two subsequent curl_exec() calls without doing a curl_close(), enabling the use of HTTP Keep-Alive.

The test code:

$ch = curl_init('https://graph.facebook.com/xxx');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

// FIRST REQUEST
curl_exec($ch);
print_r(curl_getinfo($ch));

// SECOND REQUEST
curl_exec($ch);
print_r(curl_getinfo($ch));

curl_close($ch);

Below are the results, showing parts of the output from curl_getinfo():

// FIRST REQUEST
[total_time] => 0.976259
[namelookup_time] => 0.008271
[connect_time] => 0.208543
[pretransfer_time] => 0.715296

// SECOND REQUEST
[total_time] => 0.253083
[namelookup_time] => 3.7E-5
[connect_time] => 3.7E-5
[pretransfer_time] => 3.9E-5

The first request is pretty slow, almost one whole second, similar to your experience. But from the time of the second request (only 0.25s) you can see how much difference the keep-alive made.

Your browser uses this technique as well, of course; loading the page in a fresh instance of your browser would take considerably longer.
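Applied to the original problem of checking a bunch of ids, the same trick looks something like this (a sketch; the ids and token are placeholders). Only the URL changes between calls, so the TCP/SSL connection is reused:

```php
<?php
$accessToken = '...';              // placeholder: your access token
$ids = array('4', '5', '6');       // placeholder: the ids you want to check

$ch = curl_init();
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

$results = array();
foreach ($ids as $id) {
    // Reuse the same handle; only the URL is updated per request.
    curl_setopt($ch, CURLOPT_URL,
        "https://graph.facebook.com/$id?access_token=$accessToken");
    $results[$id] = json_decode(curl_exec($ch), true);
}

curl_close($ch);
```

Only the first request pays the full connection and handshake cost; the rest ride on the kept-alive connection.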

Ja͢ck answered Oct 17 '22 15:10


Just two thoughts:

  1. Have you verified that the browser doesn't have a persistent connection to Facebook, and that it hasn't cached the DNS lookup? (You could try adding graph.facebook.com to your hosts file to rule DNS in or out.)

  2. Are you running the PHP code from the same system/environment as your browser (not from a VM, not from another host)? And is PHP running with the same scheduling priorities as your browser (same nice level, etc.)?
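To rule DNS in or out from PHP itself, you can time the lookup directly. A quick sketch:

```php
<?php
// Time a bare DNS lookup for the Graph API host.
$start = microtime(true);
$ip = gethostbyname('graph.facebook.com');
$elapsed = microtime(true) - $start;

printf("resolved to %s in %.4f s\n", $ip, $elapsed);
// If this alone takes hundreds of milliseconds, the slowness is
// in DNS resolution, not in Facebook or in cURL.
```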

Eirik S answered Oct 17 '22 17:10


The overall biggest factor in making Graph API calls "slow" is the HTTP connection itself.

Maybe there's a little improvement to be had by tweaking some parameters or getting a server with a better connection. But this will most likely make no big difference: HTTP is generally considered "slow", and there's little that can be done about that.

"That counts as very slow when I need to check the data for a bunch of ids."

The best thing you can do to speed things up is, of course – minimize the number of HTTP requests.

If you have to do several Graph API calls in a row, try doing them as a Batch Request instead. That allows you to query several portions of data, while at the same time making only one HTTP request.
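A batch request is just a POST to the Graph endpoint with a JSON-encoded batch parameter describing the individual requests. A minimal sketch (the ids and token are placeholders):

```php
<?php
$accessToken = '...';  // placeholder: your access token

// Describe several GET requests to run in one batch.
$batch = array(
    array('method' => 'GET', 'relative_url' => '4'),
    array('method' => 'GET', 'relative_url' => '5'),
    array('method' => 'GET', 'relative_url' => '6'),
);

$ch = curl_init('https://graph.facebook.com/');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query(array(
    'access_token' => $accessToken,
    'batch'        => json_encode($batch),
)));

// One HTTP round trip; the response is a JSON array with one
// entry (code, headers, body) per request in the batch.
$responses = json_decode(curl_exec($ch), true);
curl_close($ch);
```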

CBroe answered Oct 17 '22 16:10