
Facebook sometimes does not fetch Open Graph tags

I have an iOS app for a public library that shares links to Facebook. The links point to a single domain, which contains a relatively simple PHP script that redirects to three different destination domains based on the linked content (catalog items, calendar events, and user-generated lists). I have it set up like this because I'm using iOS universal links and I don't have control over all of the link destinations, so I need a central location for the apple-app-site-association file.

In this PHP script, I'm attempting to set OG tags dynamically based on the type of content that was shared. Here's the script:

<?php

// Default to empty strings so a request without parameters does not raise notices.
$shareType = $_GET['t'] ?? '';
$contentId = $_GET['id'] ?? '';

$base_catalog_url='XXXXXXXXXXXX';
$base_list_url='XXXXXXXXXXXXX';
$base_event_url='XXXXXXXXXXXXXX';

if($shareType=='0'){
    $oclc = ''; // stays empty if the ID is neither 8 nor 9 digits long
    if(strlen($contentId)==8){
        $oclc = 'ocm'.$contentId;
    }

    if(strlen($contentId)==9){
        $oclc = 'ocn'.$contentId;
    }

    $url = $base_catalog_url.'searchCatalog?'.http_build_query(array('clientID' =>'sdIPhoneApp','term1'=>$oclc));
    $resp = simplexml_load_file($url);

    $pageTitle = $resp->HitlistTitleInfo->title;
    $isbn = $resp->HitlistTitleInfo->ISBN;
    $imageURL = 'http://www.syndetics.com/index.aspx?isbn='.$isbn.'/lc.gif&client=XXXXXXX';
    $redirectURL =  'XXXXXXXXXXXX'.$contentId;
    error_log($redirectURL);
    echo '<html>
        <head>
            <meta property="og:image" content="'.$imageURL.'" />
            <meta property="og:title" content="'.$pageTitle.'" />
            <meta name="twitter:card" content="summary" />
            <meta name="twitter:site" content="@acpl" />
            <meta name="twitter:title" content="'.$pageTitle.'" />
            <meta name="twitter:description" content="Allen County Public Library" />
            <meta name="twitter:image" content="'.$imageURL.'" />
            <meta http-equiv="refresh" content="0;URL='.$redirectURL.'">
        </head>
     </html>';
}

if($shareType=='1'){
    $url = $base_event_url.http_build_query(array('eventid' =>$contentId));
    $response = file_get_contents($url);
    $json = json_decode($response);
    $event = $json[0];
    $imageURL = $event->Image;
    $pageTitle = $event->Title;
    $description = $event->Description;

    if(strlen($imageURL)<5){
        $imageURL = 'https://XXXXXXXXX/appIcon200.png';
    }

    $redirectURL = 'XXXXXXXXXXX'.$contentId;

    echo '<html>
        <head>
            <meta property="og:image" content="'.$imageURL.'" />
            <meta property="og:title" content="'.$pageTitle.'" />
            <meta property="og:description" content="'.$description.'" />
            <meta name="twitter:card" content="summary" />
            <meta name="twitter:site" content="@acpl" />
            <meta name="twitter:title" content="'.$pageTitle.'" />
            <meta name="twitter:description" content="'.$description.'" />
            <meta name="twitter:text:description" content="'.$description.'" />
            <meta name="twitter:image" content="'.$imageURL.'" />
            <meta http-equiv="refresh" content="0;URL='.$redirectURL.'">
        </head>
     </html>';

}

if($shareType=='2'){

    $url = $base_list_url.http_build_query(array('listId' =>$contentId,'userKey'=>0));
    $response = file_get_contents($url);
    $json = json_decode($response);
    $imageURL = $json->coverImageURL;
    $pageTitle = $json->listName;
    $pageTitle = ucwords(strtolower($pageTitle));

    $redirectURL = "XXXXXXXXXXXX";

    echo '<html>
        <head>
            <meta property="og:image" content="'.$imageURL.'" />
            <meta property="og:title" content="'.$pageTitle.'" />
            <meta name="twitter:card" content="summary" />
            <meta name="twitter:site" content="@acpl" />
            <meta name="twitter:title" content="'.$pageTitle.'" />
            <meta name="twitter:description" content="Allen County Public Library" />
            <meta name="twitter:image" content="'.$imageURL.'" />
            <meta http-equiv="refresh" content="0;URL='.$redirectURL.'">
        </head>
     </html>';

}

?>
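
One thing worth ruling out in a script like the one above: the title, description, and image URL are echoed into the attributes unescaped, so a title containing a double quote would cut the attribute short and leave the scraper with a malformed tag. Below is a minimal sketch of a shared helper for the three branches, assuming PHP 7.1 or later. The function name `render_share_page` is hypothetical, and the sketch adds an `og:description` tag to every branch, which the original only does for events. It is not a confirmed fix for the intermittent scraping.

```php
<?php
// Hypothetical helper: builds the redirect page in one place, escaping every value.
function render_share_page(
    string $title,
    string $imageUrl,
    string $redirectUrl,
    string $description = 'Allen County Public Library'
): string {
    // Escape for safe use inside double-quoted HTML attributes.
    $e = function (string $value): string {
        return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
    };

    return '<html>
    <head>
        <meta property="og:image" content="' . $e($imageUrl) . '" />
        <meta property="og:title" content="' . $e($title) . '" />
        <meta property="og:description" content="' . $e($description) . '" />
        <meta name="twitter:card" content="summary" />
        <meta name="twitter:site" content="@acpl" />
        <meta name="twitter:title" content="' . $e($title) . '" />
        <meta name="twitter:description" content="' . $e($description) . '" />
        <meta name="twitter:image" content="' . $e($imageUrl) . '" />
        <meta http-equiv="refresh" content="0;URL=' . $e($redirectUrl) . '">
    </head>
</html>';
}
```

Each branch would then end with something like `echo render_share_page((string) $pageTitle, $imageURL, $redirectURL);`, where the `(string)` cast turns the SimpleXML element from the catalog lookup into plain text.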

So, based on the type of content that was shared, I fetch a page title and image to provide in the OG tags. The redirection always works, regardless of whether Facebook pulls in the tags, but the tags are used only about half the time. You can see this in the iOS app: for some shares the tags are pulled in, and for others they are not (screenshots omitted).

It seems to be random whether the tags are displayed for a given item. In the access logs on my server, when the tags are successfully displayed, I see a line like this:

66.220.158.119 - - [09/Sep/2016:09:54:50 -0400] "GET /share.php?t=1&id=76137 HTTP/1.1" 206 3771 "-" "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)"

However, when the tags are not displayed, there's nothing in the access log or the error log. This suggests that Facebook (or the Facebook component in iOS) is not even attempting to read the tags in these cases. Does this mean Facebook mistakenly thinks it has this data cached?
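
To confirm from inside PHP whether the scraper reaches the script at all, each crawler request could also be logged by the script itself. This is a minimal sketch, assuming the `facebookexternalhit` user-agent substring from the access log line above; the function name `is_facebook_scraper` is hypothetical. The 206 status in that log line is "Partial Content", which a server sends in reply to a request carrying a `Range` header, so the sketch records that header too.

```php
<?php
// Hypothetical helper: true when the User-Agent looks like Facebook's link scraper.
function is_facebook_scraper(string $userAgent): bool
{
    return stripos($userAgent, 'facebookexternalhit') !== false;
}

// At the top of share.php: record scraper hits together with any Range header.
$userAgent = $_SERVER['HTTP_USER_AGENT'] ?? '';
if (is_facebook_scraper($userAgent)) {
    error_log(sprintf(
        'Facebook scraper hit: %s (Range: %s)',
        $_SERVER['REQUEST_URI'] ?? '-',
        $_SERVER['HTTP_RANGE'] ?? 'none'
    ));
}
```

If a failed share produces no such log entry either, that supports the idea that the request never arrives, rather than arriving and getting an empty response.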

Another interesting tidbit is what happens when I try to debug one of these failed URLs on the Facebook sharing debugger (https://developers.facebook.com/tools/debug/). I'll get an error message along the lines of:

The 'og:image' property should be explicitly provided, even if a value can be inferred from other tags.

And when I click "See what our scraper sees for your URL." I get the response "The document returned no data".

The interesting thing is that when I click "Scrape again", it usually gives the same error for the first few times, then after 3 or 4 attempts it suddenly works and the tags are displayed. My first thought there is that this has to do with how I'm dynamically fetching the content for the tags, but as I noted above, in the cases where the tags aren't displayed, the access log shows that Facebook isn't even requesting anything from my server.
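
If the upstream lookups are sometimes slow or fail, the script would emit tags with empty values for that request. Whether that happens here is unconfirmed, given the missing log entries, but it is cheap to guard against. The sketch below assumes two hypothetical helpers, `fetch_or_null` and `event_tags`, an arbitrary 3-second timeout, and the event JSON shape used in the script (an array whose first element has `Title` and `Image`).

```php
<?php
// Hypothetical wrapper: returns the response body, or null on failure or timeout.
function fetch_or_null(string $url, float $timeoutSeconds = 3.0): ?string
{
    // The 'http' context options apply to http:// and https:// URLs.
    $context = stream_context_create([
        'http' => ['timeout' => $timeoutSeconds],
    ]);

    // Suppress the warning; the false return value is handled just below.
    $body = @file_get_contents($url, false, $context);

    return $body === false ? null : $body;
}

// Hypothetical helper: picks title and image from the event JSON,
// falling back to the supplied defaults when anything is missing.
function event_tags(?string $body, array $defaults): array
{
    $decoded = $body === null ? null : json_decode($body);
    $event = is_array($decoded) ? ($decoded[0] ?? null) : null;
    if (!is_object($event)) {
        return $defaults;
    }

    $title = (string) ($event->Title ?? '');
    $image = (string) ($event->Image ?? '');

    return [
        'title' => $title !== '' ? $title : $defaults['title'],
        // Same rule as the script: an image value shorter than 5 characters is treated as missing.
        'image' => strlen($image) >= 5 ? $image : $defaults['image'],
    ];
}
```

With this in place the scraper always receives a complete set of tags, even when the events service does not answer in time.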

Thanks for your help; this has me pulling my hair out!

UPDATE: Here's an example URL if you'd like to try it out in the Facebook debugger: https://amshare.acpl.lib.in.us/0_930144011

The number after the underscore is the OCLC number of the book, so you could plug in other values there. As I mentioned, after a few scrapes it usually starts working, then later fails to work again, etc.
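
For reference, the `type_id` form of the example URL and the prefix rule from the script can be handled in one place, with unexpected input rejected up front. This is a sketch under assumptions: both function names are hypothetical, the allowed types 0 to 2 and the 8- and 9-digit prefixes are taken from the script above, and any other length is treated as invalid here because the script does not say how it should be handled. How the server maps this path onto share.php is not shown in the question.

```php
<?php
// Hypothetical parser for paths such as "/0_930144011".
// Returns ['t' => type, 'id' => numeric ID], or null if the path does not match.
function parse_share_token(string $path): ?array
{
    if (preg_match('#^/?([0-2])_(\d+)$#', $path, $m) !== 1) {
        return null;
    }

    return ['t' => $m[1], 'id' => $m[2]];
}

// Hypothetical helper: builds the catalog search term from an OCLC number.
function oclc_search_term(string $contentId): ?string
{
    if (!ctype_digit($contentId)) {
        return null;
    }

    switch (strlen($contentId)) {
        case 8:
            return 'ocm' . $contentId;
        case 9:
            return 'ocn' . $contentId;
        default:
            return null; // Length not covered by the original script.
    }
}
```

Returning null for malformed input lets the script send a plain fallback page instead of querying the catalog with an empty search term.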

Asked by James Harpe, Sep 09 '16 at 14:09




1 Answer

It could be that Facebook caches the share.php file and ignores the GET variables.

You could try rewriting the URL to a "pretty permalink". Put this in your .htaccess file (if you have Apache):

Options +FollowSymLinks
RewriteEngine On

RewriteRule ^share/(.*)/(.*)$ share.php?t=$1&id=$2 [L,NC]

This internally rewrites http://your-url.com/share/4/yeah to http://your-url.com/share.php?t=4&id=yeah

The $_GET array then looks like this:

Array ( [t] => 4 [id] => yeah )

With this you could solve the problem (if it REALLY is caching). I have had a lot of issues with the Facebook scraper in the past. Sometimes it ignores GET variables, and it caches like hell...

Answered by Adrian Lambertz, Nov 02 '22 at 03:11
