
Formatting Twitter text (TweetText) with C#

Is there a better way to format text from Twitter so that hyperlinks, usernames and hashtags become links? What I have works, but I know it could be done better, and I am interested in alternative techniques. I am setting this up as an HTML helper for ASP.NET MVC.

using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;
using System.Web;
using System.Web.Mvc;

namespace Acme.Mvc.Extensions
{

    public static class MvcExtensions
    {
        const string ScreenNamePattern = @"@([A-Za-z0-9\-_&;]+)";
        const string HashTagPattern = @"#([A-Za-z0-9\-_&;]+)";
        const string HyperLinkPattern = @"(http://\S+)\s?";

        public static string TweetText(this HtmlHelper helper, string text)
        {
            return FormatTweetText(text);
        }

        public static string FormatTweetText(string text)
        {
            string result = text;

            if (result.Contains("http://"))
            {
                var links = new List<string>();
                foreach (Match match in Regex.Matches(result, HyperLinkPattern))
                {
                    var url = match.Groups[1].Value;
                    if (!links.Contains(url))
                    {
                        links.Add(url);
                        result = result.Replace(url, String.Format("<a href=\"{0}\">{0}</a>", url));
                    }
                }
            }

            if (result.Contains("@"))
            {
                var names = new List<string>();
                foreach (Match match in Regex.Matches(result, ScreenNamePattern))
                {
                    var screenName = match.Groups[1].Value;
                    if (!names.Contains(screenName))
                    {
                        names.Add(screenName);
                        result = result.Replace("@" + screenName,
                           String.Format("<a href=\"http://twitter.com/{0}\">@{0}</a>", screenName));
                    }
                }
            }

            if (result.Contains("#"))
            {
                var names = new List<string>();
                foreach (Match match in Regex.Matches(result, HashTagPattern))
                {
                    var hashTag = match.Groups[1].Value;
                    if (!names.Contains(hashTag))
                    {
                        names.Add(hashTag);
                        result = result.Replace("#" + hashTag,
                           String.Format("<a href=\"http://twitter.com/search?q={0}\">#{1}</a>",
                           HttpUtility.UrlEncode("#" + hashTag), hashTag));
                    }
                }
            }

            return result;
        }

    }

}

Brennan asked Jul 27 '09 01:07



1 Answer

That is remarkably similar to the code I wrote that displays my Twitter status on my blog. The only further things I do are:

1) looking up @name and replacing it with <a href="http://twitter.com/name">Real Name</a>;

2) multiple @names in a row get commas if they don't already have them;

3) Tweets that start with @name(s) are formatted "To @name:".
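
As a rough illustration of points 2 and 3, here is a minimal sketch of the salutation step. It is not the parser from my blog: the `Salutation` class and the `realNames` dictionary are hypothetical stand-ins for whatever lookup supplies a real name, and only the leading @names are handled (the rest of the tweet is HTML-encoded, not linkified).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class Salutation
{
    // One or more @names at the very start, separated by whitespace and/or commas.
    private static readonly Regex LeadingNames = new Regex(
        @"^(?:@[A-Za-z0-9_]+[\s,]*)+", RegexOptions.Compiled);

    private static readonly Regex Name = new Regex(
        @"@([A-Za-z0-9_]+)", RegexOptions.Compiled);

    // realNames is an assumed screen-name -> display-name lookup.
    public static string Format(string tweet, IDictionary<string, string> realNames)
    {
        Match lead = LeadingNames.Match(tweet);
        if (!lead.Success) return WebUtility.HtmlEncode(tweet);

        var links = Name.Matches(lead.Value).Cast<Match>().Select(m =>
        {
            string screenName = m.Groups[1].Value;
            string display;
            // Fall back to the @name itself when no real name is known.
            if (!realNames.TryGetValue(screenName, out display)) display = "@" + screenName;
            return string.Format("<a href=\"http://twitter.com/{0}\">{1}</a>",
                screenName, WebUtility.HtmlEncode(display));
        });

        // Drop a colon or spaces the author may have typed after the names.
        string rest = tweet.Substring(lead.Length).TrimStart(':', ' ');

        // Joining with ", " supplies the commas whether or not the tweet had them.
        return "<span class=\"salutation\">To " + string.Join(", ", links) + ":</span> "
            + WebUtility.HtmlEncode(rest);
    }
}
```

Calling `Salutation.Format("@user1 @user2 check this", names)` with only `user1` present in `names` links `user1` under its real name and falls back to `@user2` for the other.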

I don't see any reason this can't be an effective way to parse a tweet: tweets have a very consistent format (which suits regex), and in most situations the speed (milliseconds) is more than acceptable.
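
If you want an alternative technique, one option is a single pass over the text with one combined pattern instead of repeated `String.Replace` calls. Repeated replacement can rewrite text inside a longer token (for example `@bob` inside `@bobby`) or inside markup that was already inserted. The sketch below is only that, a sketch: the `TweetFormatter` class name and the simplified character classes are my assumptions, not taken from the question, and it still has known gaps (trailing punctuation is swallowed into URLs, and the `@` in an e-mail address would be linked).

```csharp
using System;
using System.Net;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical single-pass formatter, for illustration only.
public static class TweetFormatter
{
    // One alternation, tried left to right at each position: URL, then @name, then #tag.
    // A "#anchor" inside a URL is consumed by the URL branch, so it is never treated as a hashtag.
    private static readonly Regex TokenPattern = new Regex(
        @"(?<url>https?://\S+)|@(?<name>[A-Za-z0-9_]+)|#(?<tag>[A-Za-z0-9_]+)",
        RegexOptions.Compiled);

    public static string Format(string text)
    {
        if (string.IsNullOrEmpty(text)) return string.Empty;

        var html = new StringBuilder();
        int position = 0;

        foreach (Match match in TokenPattern.Matches(text))
        {
            // Plain text between tokens is HTML-encoded and never linked.
            html.Append(WebUtility.HtmlEncode(text.Substring(position, match.Index - position)));
            html.Append(Link(match));
            position = match.Index + match.Length;
        }

        // Encode whatever follows the last token.
        html.Append(WebUtility.HtmlEncode(text.Substring(position)));
        return html.ToString();
    }

    private static string Link(Match match)
    {
        if (match.Groups["url"].Success)
        {
            // Encode so that '&' and quotes are safe inside the href attribute.
            string url = WebUtility.HtmlEncode(match.Groups["url"].Value);
            return string.Format("<a href=\"{0}\">{0}</a>", url);
        }

        if (match.Groups["name"].Success)
        {
            string name = match.Groups["name"].Value;
            return string.Format("<a href=\"http://twitter.com/{0}\">@{0}</a>", name);
        }

        string tag = match.Groups["tag"].Value;
        return string.Format("<a href=\"http://twitter.com/search?q={0}\">#{1}</a>",
            Uri.EscapeDataString("#" + tag), tag);
    }
}
```

Because every character of the input is visited exactly once, no duplicate tracking lists are needed, and the `Contains` pre-checks become unnecessary.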

Edit:

Here is the code for my Tweet parser. It's a bit too long to put in a Stack Overflow answer. It takes a tweet like:

@user1 @user2 check out this cool link I got from @user3: http://url.com/page.htm#anchor #coollinks

And turns it into:

<span class="salutation">
    To <a href="http://twitter.com/user1">Real Name</a>,
    <a href="http://twitter.com/user2">Real Name</a>:
</span> check out this cool link I got from
<span class="salutation">
    <a href="http://www.twitter.com/user3">Real Name</a>
</span>:
<a href="http://url.com/page.htm#anchor">http://url.com/...</a>
<a href="http://twitter.com/#search?q=%23coollinks">#coollinks</a>

It also wraps all that markup in a little JavaScript:

document.getElementById('twitter').innerHTML = '{markup}';

This is so the tweet fetcher can run asynchronously as a script, so if Twitter is down or slow it won't affect my site's page load time.
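
One thing to watch with this wrapping is that the markup has to be escaped before it goes inside the single-quoted JavaScript string, otherwise an apostrophe or line break in a tweet breaks the script. A minimal sketch of that step, assuming a hypothetical `TweetScript` helper (not my actual code):

```csharp
using System;
using System.Text;

// Hypothetical helper, for illustration only.
public static class TweetScript
{
    // Builds: document.getElementById('<id>').innerHTML = '<markup>';
    public static string Wrap(string elementId, string markup)
    {
        return string.Format("document.getElementById('{0}').innerHTML = '{1}';",
            EscapeForJs(elementId), EscapeForJs(markup));
    }

    // Escapes a value for use inside a single-quoted JavaScript string literal.
    private static string EscapeForJs(string value)
    {
        var sb = new StringBuilder(value.Length + 16);
        foreach (char c in value)
        {
            switch (c)
            {
                case '\\': sb.Append(@"\\"); break;
                case '\'': sb.Append(@"\'"); break;
                case '\r': sb.Append(@"\r"); break;
                case '\n': sb.Append(@"\n"); break;
                // Keeps a literal "</script>" in the markup from ending an inline script block.
                case '<': sb.Append(@"\u003c"); break;
                // Line and paragraph separators are written as escapes to be safe in older engines.
                case '\u2028': sb.Append(@"\u2028"); break;
                case '\u2029': sb.Append(@"\u2029"); break;
                default: sb.Append(c); break;
            }
        }
        return sb.ToString();
    }
}
```

The browser decodes `\u003c` back to `<` when it evaluates the string, so the element still receives the original markup.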

Rex M answered Oct 03 '22 13:10