Is there a better way to format text from Twitter to link the hyperlinks, username and hashtags? What I have is working but I know this could be done better. I am interested in alternative techniques. I am setting this up as a HTML Helper for ASP.NET MVC.
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;
using System.Web;
using System.Web.Mvc;
namespace Acme.Mvc.Extensions
{
public static class MvcExtensions
{
const string ScreenNamePattern = @"@([A-Za-z0-9\-_&;]+)";
const string HashTagPattern = @"#([A-Za-z0-9\-_&;]+)";
const string HyperLinkPattern = @"(http://\S+)\s?";
public static string TweetText(this HtmlHelper helper, string text)
{
return FormatTweetText(text);
}
public static string FormatTweetText(string text)
{
string result = text;
if (result.Contains("http://"))
{
var links = new List<string>();
foreach (Match match in Regex.Matches(result, HyperLinkPattern))
{
var url = match.Groups[1].Value;
if (!links.Contains(url))
{
links.Add(url);
result = result.Replace(url, String.Format("<a href=\"{0}\">{0}</a>", url));
}
}
}
if (result.Contains("@"))
{
var names = new List<string>();
foreach (Match match in Regex.Matches(result, ScreenNamePattern))
{
var screenName = match.Groups[1].Value;
if (!names.Contains(screenName))
{
names.Add(screenName);
result = result.Replace("@" + screenName,
String.Format("<a href=\"http://twitter.com/{0}\">@{0}</a>", screenName));
}
}
}
if (result.Contains("#"))
{
var names = new List<string>();
foreach (Match match in Regex.Matches(result, HashTagPattern))
{
var hashTag = match.Groups[1].Value;
if (!names.Contains(hashTag))
{
names.Add(hashTag);
result = result.Replace("#" + hashTag,
String.Format("<a href=\"http://twitter.com/search?q={0}\">#{1}</a>",
HttpUtility.UrlEncode("#" + hashTag), hashTag));
}
}
}
return result;
}
}
}
Twitter does not directly provide support for formatting text in bold, italic, etc. But it does support Unicode characters [1], and so a hack to get around the formatting limitation is to replace letters with Unicode variants.
But in regards to whether or not spaces count as characters on Twitter, the answer is Yes. Spaces between words count towards the 280 character Twitter limit.
Type your Tweet (up to 280 characters) into the compose box at the top of your Home timeline, or select the Tweet button in the navigation bar. You can include up to 4 photos, a GIF, or a video in your Tweet. Select the Tweet button to post the Tweet to your profile.
That is remarkably similar to the code I wrote that displays my Twitter status on my blog. The only further things I do that I do are
1) looking up @name
and replacing it with <a href="http://twitter.com/name">Real Name</a>
;
2) multiple @name
's in a row get commas, if they don't have them;
3) Tweets that start with @name(s)
are formatted "To @name:".
I don't see any reason this can't be an effective way to parse a tweet - they are a very consistent format (good for regex) and in most situations the speed (milliseconds) is more than acceptable.
Edit:
Here is the code for my Tweet parser. It's a bit too long to put in a Stack Overflow answer. It takes a tweet like:
@user1 @user2 check out this cool link I got from @user3: http://url.com/page.htm#anchor #coollinks
And turns it into:
<span class="salutation">
To <a href="http://twitter.com/user1">Real Name</a>,
<a href="http://twitter.com/user2">Real Name</a>:
</span> check out this cool link I got from
<span class="salutation">
<a href="http://www.twitter.com/user3">Real Name</a>
</span>:
<a href="http://site.com/page.htm#anchor">http://site.com/...</a>
<a href="http://twitter.com/#search?q=%23coollinks">#coollinks</a>
It also wraps all that markup in a little JavaScript:
document.getElementById('twitter').innerHTML = '{markup}';
This is so the tweet fetcher can run asynchronously as a JS and if Twitter is down or slow it won't affect my site's page load time.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With