How can I optimize my FQL to avoid Facebook timeouts?

Let's take a simple FQL query as an example: fetch all links shared by a user's friends since yesterday (the strtotime() call is PHP, interpolated into the query string before it is sent):

SELECT link_id, title, url, owner, created_time
FROM link
WHERE
    created_time > strtotime('yesterday') AND
    owner IN (
        SELECT uid2 FROM friend WHERE uid1 = me()
    )
LIMIT 100

If a user has 50 friends, this will execute perfectly. But if a user has hundreds of friends, more often than not Facebook returns an error.

Options:

  1. Limit the friend select query to 50 -- Sure, that will work, but it will show the same friends every time. Unless you want an Ivy League-only feed, this isn't very helpful.
  2. Batch Queries -- Create a batch of queries using offsets and limit each to 50. Unfortunately there's no improvement here either.
  3. Loop It -- So far this is the best option I've found. Loop through the same queries you built for a batch query, but run them one at a time with multiple FQL API calls (see the sketch after this list). Even this is hit and miss, though.
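
Here's a minimal sketch of that looping approach, assuming the PHP SDK's api() call with the fql.query method; the chunk size of 50, the cap of 500 friends, and the app credentials are illustrative placeholders:

<?php
require_once 'facebook/php-sdk/src/facebook.php';

$facebook = new Facebook(array(
    'appId'  => 'YOUR_APP_ID',     // placeholder credentials
    'secret' => 'YOUR_APP_SECRET',
));

$since = strtotime('yesterday');
$links = array();

// Query 50 friends at a time instead of all of them at once.
for ($offset = 0; $offset < 500; $offset += 50) {
    $fql = "SELECT link_id, title, url, owner, created_time
            FROM link
            WHERE created_time > $since AND owner IN (
                SELECT uid2 FROM friend
                WHERE uid1 = me()
                LIMIT 50 OFFSET $offset
            )
            LIMIT 100";
    try {
        $result = $facebook->api(array(
            'method' => 'fql.query',
            'query'  => $fql,
        ));
        $links = array_merge($links, $result);
    } catch (FacebookApiException $e) {
        // One chunk timing out shouldn't kill the whole feed;
        // skip it and move on.
        continue;
    }
}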

How can I query Facebook appropriately to ensure successful results?

Notes:

  • I'm using the latest Facebook PHP SDK, 3.1.1
  • I've also tried raising the default cURL timeout options in base_facebook.php (see the snippet below)
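
For reference, those defaults live in the SDK's static $CURL_OPTS array in base_facebook.php (shown approximately below for 3.1.1); one way to raise the timeouts from application code, without patching the SDK, is a sketch like this (the values 30 and 120 are illustrative):

// Defaults in base_facebook.php (SDK 3.1.1, approximately):
// public static $CURL_OPTS = array(
//     CURLOPT_CONNECTTIMEOUT => 10,
//     CURLOPT_RETURNTRANSFER => true,
//     CURLOPT_TIMEOUT        => 60,
//     CURLOPT_USERAGENT      => 'facebook-php-3.1',
// );

// Override before making any API calls:
Facebook::$CURL_OPTS[CURLOPT_CONNECTTIMEOUT] = 30;   // illustrative value
Facebook::$CURL_OPTS[CURLOPT_TIMEOUT]        = 120;  // illustrative value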

Common errors associated with timeouts:

1.

Fatal error:  Uncaught Exception: 1: An unknown error occurred thrown in /..../facebook/php-sdk/src/base_facebook.php on line 708

Line 708 is where the exception is thrown:

// results are returned, errors are thrown
if (is_array($result) && isset($result['error_code'])) { 
    throw new FacebookApiException($result);
}

2.

Fatal error: Uncaught CurlException: 52: SSL read: error:00000000:lib(0):func(0):reason(0), errno 104 thrown in /..../facebook/php-sdk/src/base_facebook.php on line 814
asked Aug 30 '11 by Ryan


2 Answers

You should loop through using limit/offset like you said, or cache the friends list up front as puffpio suggested.

You said that it still wasn't working reliably. That's because some users may have many, many links while others have hardly any, and you may also be retrieving uncached data for some users. I would recommend a single retry in your loop for failed queries: it's often the case that the first attempt times out and the second succeeds because the data has just been cached.
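
A sketch of that single-retry idea, wrapping the per-chunk query from the question (the helper name and the two-attempt policy are illustrative; $facebook is the SDK instance):

// Retry each failed chunk once: the first attempt often warms
// Facebook's cache, so the second attempt frequently succeeds.
function fql_with_retry($facebook, $fql) {
    for ($attempt = 0; $attempt < 2; $attempt++) {
        try {
            return $facebook->api(array(
                'method' => 'fql.query',
                'query'  => $fql,
            ));
        } catch (FacebookApiException $e) {
            // First failure: fall through and try once more.
        }
    }
    return array(); // give up on this chunk after the retry
}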

Finally, for posterity, I'm opening a task to make the link table more efficient when it is filtered by time.

answered Nov 05 '22 by Drew Hoskins


Some DB engines do not optimize the IN keyword well, or at all; they may execute the IN clause once for every resulting row of your query. Can you join the link and friend tables instead of using IN with a subquery?
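
FQL itself doesn't expose an explicit JOIN, so the closest translation of this advice is to flatten the subquery yourself: run the friend query once, then pass the resulting IDs to the link query as a literal list. A hedged sketch, reusing the $facebook SDK instance from the question:

// Step 1: materialize the friend list once (and ideally cache it).
$friends = $facebook->api(array(
    'method' => 'fql.query',
    'query'  => 'SELECT uid2 FROM friend WHERE uid1 = me()',
));
$ids = array();
foreach ($friends as $row) {
    $ids[] = $row['uid2'];
}

// Step 2: query links against an explicit ID list, not a subquery.
// (Chunking to 50 IDs as in the question; guard against an empty list.)
$since = strtotime('yesterday');
$chunk = implode(',', array_slice($ids, 0, 50));
if ($chunk !== '') {
    $links = $facebook->api(array(
        'method' => 'fql.query',
        'query'  => "SELECT link_id, title, url, owner, created_time
                     FROM link
                     WHERE created_time > $since AND owner IN ($chunk)
                     LIMIT 100",
    ));
}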

You may find this article interesting. (It discusses issues with IN-clause performance on MySQL, and Facebook runs MySQL on the back end.)

answered Nov 05 '22 by Paul Sasik