My regex skills are poor and letting me down, so some help would be great here.
All I want to do is return all the links that appear in a tweet (just a string). Some examples are:
"Great summary http://mytest.com/blog/post.html (#test)"
"http://mytest.com/blog/post.html (#test)"
"post: http://mytest.com/blog/post.html"
It should also support multiple links like:
"read http://mytest.com/blog/post.html and http://mytest.com/blog/post_two.html"
Any help would be great!
Thanks
Ben
Try this one:
/\bhttps?:\/\/\S+\b/
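Since the question is about Ruby, here is a minimal sketch of how that pattern might be used with String#scan to pull every link out of a tweet. The helper name extract_links and the constant name URL_PATTERN are made up for illustration.

```ruby
# Pattern from the answer above: "http://" or "https://", followed by a run of
# non-whitespace characters that ends at a word boundary.
URL_PATTERN = /\bhttps?:\/\/\S+\b/

# Hypothetical helper: returns every link found in the tweet as an array.
# scan returns the full matches here because the pattern has no capture groups.
def extract_links(tweet)
  tweet.scan(URL_PATTERN)
end

p extract_links("Great summary http://mytest.com/blog/post.html (#test)")
p extract_links("read http://mytest.com/blog/post.html and http://mytest.com/blog/post_two.html")
```

Because of the trailing \b, punctuation right after a link (such as a full stop at the end of a sentence) is left out of the match. The flip side is that a link ending in "/" loses that final slash.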
Update:
To catch links beginning with "www." too (no "http://" prefix), you could try this:
/\b(?:https?:\/\/|www\.)\S+\b/
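A rough Ruby sketch of the extended pattern, again using String#scan. The names LINK_PATTERN and extract_links_with_www are invented for this example. Note that the group is non-capturing, (?:...); with a capturing group, scan would return the group contents instead of the whole link.

```ruby
# Extended pattern from the update above: a link may start with "http://",
# "https://" or a bare "www.".
LINK_PATTERN = /\b(?:https?:\/\/|www\.)\S+\b/

# Hypothetical helper: returns every matching link in the tweet.
def extract_links_with_www(tweet)
  tweet.scan(LINK_PATTERN)
end

p extract_links_with_www("read www.mytest.com/blog and http://mytest.com/blog/post.html")
```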
Here's a code snippet from a site I wrote that parses a Twitter feed. It turns links, hashtags, and Twitter usernames into HTML anchors. So far it's worked fine. I know it's C# rather than Ruby, but the regexes should be helpful.
// Requires: using System.Text.RegularExpressions;
if (tweetStream[i] != null)
{
    // Match links, @usernames and #hashtags in a single pass over the original
    // text, so markup inserted for one match is never matched again.
    var re = new Regex(@"(?<url>https?://\S+)|@(?<user>\w+)|#(?<tag>\w+)");
    var str = re.Replace(tweetStream[i].Text, m =>
    {
        string href;
        if (m.Groups["url"].Success)
            href = m.Value;                                   // plain link
        else if (m.Groups["user"].Success)
            href = "http://twitter.com/" + m.Groups["user"].Value;   // profile page
        else
            href = "http://twitter.com/#search?q=%23" + m.Groups["tag"].Value;  // hashtag search
        return "<a href='" + href + "' target='_blank'>" + m.Value + "</a>";
    });
    tweets += string1 + "<div>" + str + "</div>" + string2;
}
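For Ruby, the same linkifying idea might look roughly like the sketch below. The names TWEET_TOKEN and linkify_tweet are hypothetical, and the link targets are the ones used in the snippet above. It does the replacement in a single gsub pass so that markup inserted for one match is never matched again. It does not HTML-escape the tweet text, which real code would probably want to do.

```ruby
# One alternation covering the three things to link:
# a URL, an @username, or a #hashtag.
TWEET_TOKEN = /(?<url>https?:\/\/\S+)|@(?<user>\w+)|#(?<tag>\w+)/

# Hypothetical helper: wraps links, usernames and hashtags in <a> tags.
def linkify_tweet(text)
  text.gsub(TWEET_TOKEN) do
    m = Regexp.last_match
    href =
      if m[:url]
        m[:url]                                        # plain link
      elsif m[:user]
        "http://twitter.com/#{m[:user]}"               # profile page
      else
        "http://twitter.com/#search?q=%23#{m[:tag]}"   # hashtag search
      end
    "<a href='#{href}' target='_blank'>#{m[0]}</a>"
  end
end

puts linkify_tweet("http://mytest.com/blog/post.html (#test) via @ben")
```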