
PHP: How to use the Twitter API's data to convert URLs, mentions, and hashtags in tweets to links?

Tags: php, twitter, api

I'm really stumped on how Twitter expects users of its API to convert the plaintext tweets it sends to properly linked HTML.

Here's the deal: Twitter's JSON API sends this set of information back when you request the detailed data for a tweet:

{
    "created_at":"Wed Jul 18 01:03:31 +0000 2012",
    "id":225395341250412544,
    "id_str":"225395341250412544",
    "text":"This is a test tweet. #boring @nbc http://t.co/LUfDreY6 #skronk @crux http://t.co/VpuMlaDs @twitter",
    "source":"web",
    "truncated":false,
    "in_reply_to_status_id":null,
    "in_reply_to_status_id_str":null,
    "in_reply_to_user_id":null,
    "in_reply_to_user_id_str":null,
    "in_reply_to_screen_name":null,
    "user": <REDACTED>,
    "geo":null,
    "coordinates":null,
    "place":null,
    "contributors":null,
    "retweet_count":0,
    "entities":{
        "hashtags":[
            {
                "text":"boring",
                "indices":[22,29]
            },
            {
                "text":"skronk",
                "indices":[56,63]
            }
        ],
        "urls":[
            {
                "url":"http://t.co/LUfDreY6",
                "expanded_url":"http://www.twitter.com",
                "display_url":"twitter.com",
                "indices":[35,55]
            },
            {
                "url":"http://t.co/VpuMlaDs",
                "expanded_url":"http://www.example.com",
                "display_url":"example.com",
                "indices":[70,90]
            }
        ],
        "user_mentions":[
            {
                "screen_name":"nbc",
                "name":"NBC",
                "id":26585095,
                "id_str":"26585095",
                "indices":[30,34]
            },
            {
                "screen_name":"crux",
                "name":"Z. D. Smith",
                "id":407213,
                "id_str":"407213",
                "indices":[64,69]
            },
            {
                "screen_name":"twitter",
                "name":"Twitter",
                "id":783214,
                "id_str":"783214",
                "indices":[91,99]
            }
        ]
    },
    "favorited":false,
    "retweeted":false,
    "possibly_sensitive":false
}

The interesting parts, for this question, are the text element and the entries in the hashtags, user_mentions, and urls arrays. Twitter is telling us where in the text element the hashtags, mentions, and URLs appear with the indices arrays... so here's the crux of the question:

How do you use those indices arrays?

You can't just use them straight up by looping over each link element with something like substr_replace, since replacing the first link element in the text will invalidate all the index values for subsequent link elements. You also can't use substr_replace's array functionality, since it only works when you give it an array of strings for the first arg, rather than a single string (I've tested this. The results are... strange).
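To make the drift concrete, here is a minimal sketch using a shortened copy of the tweet text and the first two index pairs from the JSON above. The variable names and the placeholder link markup are illustrative only, not anything the API provides.

```php
<?php
// Shortened version of the tweet text from the question (ASCII only).
$text = "This is a test tweet. #boring @nbc";

// Index pairs as given in the JSON: start is inclusive, end is exclusive.
$hashtag = ['start' => 22, 'end' => 29]; // "#boring"
$mention = ['start' => 30, 'end' => 34]; // "@nbc"

// Replace the first entity, working front to back.
$linked = substr_replace(
    $text,
    '<a href="#">#boring</a>',
    $hashtag['start'],
    $hashtag['end'] - $hashtag['start']
);

// In the original text the mention's offsets select the mention...
$before = substr($text, $mention['start'], $mention['end'] - $mention['start']);

// ...but in the edited text the same offsets land inside the new markup.
$after = substr($linked, $mention['start'], $mention['end'] - $mention['start']);

var_dump($before, $after);
```

After the first replacement, the mention's offsets no longer select "@nbc", because everything to the right of the hashtag has shifted by the length of the added markup.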

Is there some function that can simultaneously replace multiple index-delimited substrings in a single string with different replacement strings?

coredumperror asked Jul 18 '12 01:07



1 Answer

To use the indices Twitter provides directly with a simple replace, collect the replacements you want to make and then sort them in descending order of start index. You can probably find a cleverer way to build $entities; I wanted each entity type to be optional anyway, so I kept that part simple (KISS).

Either way, my point here is just to show that you don't need to explode the string, count characters, and whatnot. Regardless of how you do it, all you need to do is start at the end of the string and work toward the beginning. Each replacement then only shifts text that has already been handled, so the indices Twitter gives you for the remaining entities are still valid.
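As a minimal sketch of that idea on its own, separate from any tweet-specific markup: the helper name replace_by_indices and the placeholder <b>/<i> replacements are made up for illustration, and the sample text is ASCII, so byte offsets and character offsets coincide.

```php
<?php
/**
 * Apply index-delimited replacements from the end of the string to the start.
 * Hypothetical helper: each item is
 * ['start' => int, 'end' => int, 'replacement' => string], with 'end' exclusive,
 * matching the indices arrays in the question.
 */
function replace_by_indices(string $text, array $items): string
{
    // Highest start offset first, so each edit only moves text
    // that has already been processed.
    usort($items, function ($a, $b) {
        return $b['start'] <=> $a['start'];
    });

    foreach ($items as $item) {
        $length = $item['end'] - $item['start'];
        $text = substr_replace($text, $item['replacement'], $item['start'], $length);
    }

    return $text;
}

$text = "This is a test tweet. #boring @nbc";
$html = replace_by_indices($text, [
    ['start' => 22, 'end' => 29, 'replacement' => '<b>#boring</b>'],
    ['start' => 30, 'end' => 34, 'replacement' => '<i>@nbc</i>'],
]);

echo $html, "\n";
// This is a test tweet. <b>#boring</b> <i>@nbc</i>
```

The items can be passed in any order, since the helper sorts them itself before replacing.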

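<?php

// Convert a decoded tweet object (e.g. from json_decode) to HTML by linking its entities.
// The three flags switch links, mentions, and hashtags on or off individually.
function json_tweet_text_to_HTML($tweet, $links=true, $users=true, $hashtags=true)
{
    $return = $tweet->text;

    // Each entry holds a start offset, an end offset (exclusive), and the replacement HTML.
    $entities = array();

    if($links && is_array($tweet->entities->urls))
    {
        foreach($tweet->entities->urls as $e)
        {
            $temp = array();
            $temp["start"] = $e->indices[0];
            $temp["end"] = $e->indices[1];
            // Escape the values, since they end up in single-quoted attributes and in markup.
            $temp["replacement"] = "<a href='".htmlspecialchars($e->expanded_url, ENT_QUOTES)."' target='_blank'>".htmlspecialchars($e->display_url, ENT_QUOTES)."</a>";
            $entities[] = $temp;
        }
    }
    if($users && is_array($tweet->entities->user_mentions))
    {
        foreach($tweet->entities->user_mentions as $e)
        {
            $temp = array();
            $temp["start"] = $e->indices[0];
            $temp["end"] = $e->indices[1];
            $temp["replacement"] = "<a href='https://twitter.com/".rawurlencode($e->screen_name)."' target='_blank'>@".htmlspecialchars($e->screen_name, ENT_QUOTES)."</a>";
            $entities[] = $temp;
        }
    }
    if($hashtags && is_array($tweet->entities->hashtags))
    {
        foreach($tweet->entities->hashtags as $e)
        {
            $temp = array();
            $temp["start"] = $e->indices[0];
            $temp["end"] = $e->indices[1];
            $temp["replacement"] = "<a href='https://twitter.com/hashtag/".rawurlencode($e->text)."?src=hash' target='_blank'>#".htmlspecialchars($e->text, ENT_QUOTES)."</a>";
            $entities[] = $temp;
        }
    }

    // Sort by start offset, highest first, so replacements run from the end of the text backwards.
    usort($entities, function($a,$b){return($b["start"]-$a["start"]);});

    // Note: substr_replace counts bytes, so this assumes the text has no multibyte characters.
    foreach($entities as $item)
    {
        $return = substr_replace($return, $item["replacement"], $item["start"], $item["end"] - $item["start"]);
    }

    return($return);
}


?>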
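
One caveat about the function above: substr_replace counts bytes, while the indices are, as far as I know, counted in characters (Unicode code points). That is an assumption worth checking against Twitter's entity documentation. If it holds, a tweet containing accented letters or emoji would get its links placed at the wrong offsets. Here is a rough sketch of a multibyte-aware replacement, assuming UTF-8 text and the mbstring extension; mb_replace_range is a made-up helper name, not a built-in.

```php
<?php
/**
 * Replace the half-open character range [$start, $end) in a UTF-8 string.
 * Hypothetical helper; offsets are counted in characters, not bytes.
 * Requires the mbstring extension.
 */
function mb_replace_range(string $text, string $replacement, int $start, int $end): string
{
    return mb_substr($text, 0, $start, 'UTF-8')
        . $replacement
        . mb_substr($text, $end, null, 'UTF-8');
}

// "\u{00E9}" is e-acute: one character, two bytes in UTF-8.
$text  = "Caf\u{00E9} time #boring";
$start = 10; // character offset of "#"
$end   = 17; // one past the final "g"

// Character-based replacement puts the markup in the right place.
$html = mb_replace_range($text, '<b>#boring</b>', $start, $end);

// Byte-based replacement with the same offsets is off by one byte.
$bytewise = substr_replace($text, '<b>#boring</b>', $start, $end - $start);

echo $html, "\n", $bytewise, "\n";
```

To use this with the function above, the substr_replace call inside the final loop could be swapped for the helper; the back-to-front ordering stays the same.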
vita10gy answered Oct 06 '22 01:10