
How do I reverse a t.co URL to the originating Tweet?

I'm going through our site analytics, and have a load of t.co URLs which were referrers to a promotion we were doing. I'm trying to figure out if there is a way to trace those back to the tweets where they originated, through the Twitter API or other means. I can't find a good way to do this, though. Is there one?

Greg Hinch asked Dec 01 '12 13:12



1 Answer

When a t.co forward points to a tweet, it goes to the web page for that tweet and the HTML for the page will include the canonical URL.

The ugly way to get this information is to use wget or curl to fetch the HTML of the destination page, which will include the URL of the original tweet.
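If you do go the wget/curl route, pulling the canonical URL out of the saved HTML could look roughly like the sketch below. It assumes the page declares it with a standard `<link rel="canonical" href="...">` tag; the names `CanonicalLinkParser` and `find_canonical` are mine, made up for illustration.

```python
from html.parser import HTMLParser


class CanonicalLinkParser(HTMLParser):
    """Collects the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr_map = dict(attrs)
        # rel may hold several space-separated tokens, e.g. "canonical foo"
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_tokens:
            self.canonical = attr_map.get("href")


def find_canonical(html_text):
    """Return the canonical URL declared in html_text, or None if absent."""
    parser = CanonicalLinkParser()
    parser.feed(html_text)
    return parser.canonical
```

Feed it the text of a page saved with curl, e.g. `find_canonical(open("page.html").read())`. If the page builds its head with JavaScript, the tag may simply not be in the downloaded HTML and you will get None.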

A better way to do it is with the Python module, Requests (you will need to install this module first). Here's a quick command line script that will do it:

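#!/usr/bin/env python

import requests

try:
    input = raw_input  # Python 2: raw_input() returns the typed line as a string
except NameError:
    pass  # Python 3: input() already does that

shorturl = input("Enter the shortened URL in its entirety: ")
# requests follows redirects by default, so r.url is the final destination
r = requests.get(shorturl)

print("""
The shortened URL forwards to:

    %s
""" % r.url)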

That code will work on any of those URL shortening services, not just Twitter's t.co site.

I did my testing with Python 2.7. The Requests calls are the same on Python 3.x; the only difference to watch for is that Python 3 renamed raw_input() to input(). Either way, Requests is your friend, see the documentation for details:

http://docs.python-requests.org/en/latest/index.html

The redirection and history section covers this example.

I don't know of a way to do it through the Twitter API, and it may not be possible if all URL shortening is automatic. Still, an API-based solution would only work with the t.co addresses, whereas the code above will work on any other shortened URL or any URL which redirects (e.g. HTTP 301 or 302 response codes) to another location.
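To see each hop of a redirect rather than just the final address, the `history` attribute of a Requests response can be walked. This is a rough sketch, not tested against every shortener; `redirect_chain` and `resolve` are names I made up, and the use of HEAD instead of GET is my own choice to avoid downloading page bodies.

```python
def redirect_chain(response):
    """Return [(status_code, url), ...] for every hop, final response last."""
    hops = [(r.status_code, r.url) for r in response.history]
    hops.append((response.status_code, response.url))
    return hops


def resolve(shorturl, timeout=10):
    """Follow redirects from shorturl and return the chain of hops."""
    # Third-party module; imported here so redirect_chain() has no dependencies.
    import requests

    # requests.head() does not follow redirects unless asked to.
    # Some servers reject HEAD, in which case requests.get() is the fallback.
    response = requests.head(shorturl, allow_redirects=True, timeout=timeout)
    return redirect_chain(response)
```

Calling `resolve()` on a shortened URL returns a list of (status code, URL) pairs, with the 301/302 hops first and the final destination last.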

Edit (better a bit later than never): After using the above to find where the t.co forward actually points, there are three or four possible types of result. The most common is what the OP believes they all are: a shortening of a URL pasted into a tweet. To be fair, that is what most of them are.

The other possibilities are internal to Twitter: a link back to the tweet itself, which usually only appears with rather long tweets (I'm not sure how much more frequent that became with the character limit increase); a forward to the URL of a status separate from the tweet author's own status URL, which is often the case with embedded media (images and video); and a forward to the URL of a tweet which is being quote tweeted or retweeted.

Given the OP's original scenario, none of those internal Twitter usages should ever be seen, and only the "normal" forwarding is of concern here. Now, searching for the t.co address at twitter.com avails us nothing, regardless of which combinations are used.
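If you want to separate the internal Twitter forwards from the "normal" ones in bulk, something like this sketch could be run on each resolved address. It assumes tweet permalinks live on a twitter.com host and contain /status/ followed by a numeric ID in the path; the host list and the name `tweet_id_from_url` are my own, for illustration only.

```python
import re
from urllib.parse import urlsplit

# Assumption: tweet permalinks contain "/status/<numeric id>" in the path.
_STATUS_RE = re.compile(r"/status(?:es)?/(\d+)")
# Assumption: these are the hosts worth treating as Twitter itself.
_TWITTER_HOSTS = {"twitter.com", "www.twitter.com", "mobile.twitter.com"}


def tweet_id_from_url(url):
    """Return the tweet ID if url looks like a tweet permalink, else None."""
    parts = urlsplit(url)
    if (parts.hostname or "").lower() not in _TWITTER_HOSTS:
        return None
    match = _STATUS_RE.search(parts.path)
    return match.group(1) if match else None
```

Anything that returns None is an ordinary external link; anything that returns an ID is one of the internal cases described above.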

Searching for the target address, the one revealed by scripts like the one at the start of this answer, is quite another matter. That will return every publicly accessible tweet which posted that link. There are, however, some drawbacks, including:

  1. The search results will include tweets where other forwarding services were used as well.
  2. There is no way to tell whether all the tweets which linked to that URL generated the same t.co address or not.
  3. If not, there is no way to see which t.co forward was utilised by which tweet.

Nevertheless, in conjunction with complete referrer logs on a web server, it may be possible to narrow that further, assuming the referrer reports the URL of the tweet and not simply twitter.com. That, however, is more likely to be determined by how the person clicked on the link (i.e. were they just seeing the tweet in a stream, or had they expanded it enough to display its full URL).

I suspect the effectiveness of referrer logs will be sporadic, and likely reduced on smartphones and tablets, where the apps in use are less likely to have expanded tweets in that way and so to pass that data on to third-party websites.
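For what it's worth, tallying the Twitter referrers in a server log might look like the sketch below. It assumes the common Apache/nginx "combined" log format, where the referrer is the quoted field after the status code and byte count; adjust the pattern if your logs differ. `count_referrers` is a made-up name.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Assumption: "combined" log format, i.e.
# host ident user [time] "request" status bytes "referrer" "user-agent"
_LINE_RE = re.compile(
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ "(?P<referrer>[^"]*)"'
)


def count_referrers(lines, hosts=("t.co", "twitter.com")):
    """Count hits per referrer URL, keeping only referrers on the given hosts."""
    counts = Counter()
    for line in lines:
        match = _LINE_RE.search(line)
        if not match:
            continue  # not a combined-format line
        referrer = match.group("referrer")
        host = (urlsplit(referrer).hostname or "").lower()
        if host in hosts:
            counts[referrer] += 1
    return counts
```

Run it over an open log file, e.g. `count_referrers(open("access.log"))`, and `most_common()` on the result shows which t.co addresses sent the most traffic.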

#!/usr/bin/env python3

import requests
import urllib.parse

shorturl = input("Enter the shortened URL in its entirety: ")
# Follow the redirects; r0.url is the address the short URL forwards to.
r0 = requests.get(shorturl, verify=True)
t0 = "https://twitter.com/search?f=tweets&q="
# URL-encode the target address so it can be used as the search query.
t1 = urllib.parse.quote_plus(r0.url)
r1 = requests.get("{0}{1}".format(t0, t1), verify=True)

# The results will be in r1.content.
# There may be some benefit from cutting the http:// or
# https:// from r0.url before creating the quoted string in t1.

That, however, is as good as it gets ... without paying Twitter for enhanced data access.
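On the point about cutting the http:// or https:// before quoting: a minimal sketch of that step, using only the standard library, could be the following. The search base is the same one used in the script above; `strip_scheme` and `search_url_for` are names invented here, and whether the scheme-less query actually returns better results is untested.

```python
from urllib.parse import quote_plus, urlsplit

# Same search address as in the script above.
SEARCH_BASE = "https://twitter.com/search?f=tweets&q="


def strip_scheme(url):
    """Drop a leading http:// or https:// and leave everything else alone."""
    scheme = urlsplit(url).scheme
    if scheme in ("http", "https"):
        return url[len(scheme) + len("://"):]
    return url


def search_url_for(target):
    """Build the search URL for a resolved target address, scheme removed."""
    return SEARCH_BASE + quote_plus(strip_scheme(target))
```

Pass the resolved address (r0.url in the script above) to `search_url_for()` and request the result the same way as before.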

Ben answered Oct 12 '22 03:10