 

How to replace URLs not within an a href tag in JavaScript

I have text that contains URL links. The links come in two forms:

  1. www.stackoverflow.com
  2. <a href="http://www.stackoverflow.com">Stack over flow</a>

I am trying to write a simple regex-based function that wraps all links of type 1 in an A HREF tag while leaving the ones that are already wrapped alone.

I have something like this, but it does not work:

function replaceURLWithHTMLLinks(text) {
    var exp = /(<(\s*)a(\s)*href.*>.*<\/(\s)*a(\s*)>)/ig;
    var matches = exp.exec(text);
    for(var i=0; i < matches.length; i++) {
        var line = matches[i];
        if(!exp.test(line)) {
            var exp2 = /(\b(?:(?:(?:https?|ftp|file):\/\/|www\.|ftp\.)[-A-Z0-9+&@#\/%?=~_|$!:,.;]*[-A-Z0-9+&@#\/%=~_|$])|”(?:(?:https?|ftp|file):\/\/|www\.|ftp\.)[^"\r\n]+”?|’(?:(?:https?|ftp|file):\/\/|www\.|ftp\.)[^'\r\n]+’?)/ig;
            text = text.replace("http://","");
            text = text.replace(exp2, "<a href=http://$1>$1</a>");
        }
    }

    return text;
}

It's not working, but I'm hoping someone can fix it :)

EDIT

The solution that fixed it, with the help of @MikeM's answer:

function replaceLinksSO(text) {
    // Declare with var so rex does not leak into the global scope.
    var rex = /(<a href=")?(?:https?:\/\/)?(?:(?:www)[-A-Za-z0-9+&@#\/%?=~_|$!:,.;]+\.)+[-A-Za-z0-9+&@#\/%?=~_|$!:,.;]+/ig;
    return text.replace(rex, function ( $0, $1 ) {
        if(/^https?:\/\/.+/i.test($0)) {
            return $1 ? $0: '<a href="'+$0+'">'+$0+'</a>';
        }
        else {
            return $1 ? $0: '<a href="http://'+$0+'">'+$0+'</a>';
        }
    });
}
george_h asked Feb 21 '13 09:02


2 Answers

Without trying to analyze the complex regex and function above, here is an example implementation that uses a toy URL-matching pattern to illustrate one way of making such replacements:

var str = ' www.stackoverflow.com  <a href="http://www.somesite.com">somesite</a> www.othersite.org ',
    rex = /(<a href=")?(?:https?:\/\/)?(?:\w+\.)+\w+/g;

str = str.replace( rex, function ( $0, $1 ) {
    return $1 ? $0 : '<a href="' + $0 + '">' + $0 + '</a>';
});

You can alter the URL-matching pattern, e.g. by inserting \s* where required.
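To spell out why this works: the optional first group captures '<a href="' only when the URL is already the target of an anchor, so the callback can tell the two cases apart and return already-linked matches unchanged. Below is a minimal sketch of that idea wrapped in a function. The name linkifyBare and the choice to prefix scheme-less URLs with "http://" are assumptions for illustration, not part of the answer above.

```javascript
// Hypothetical helper; uses the same toy URL pattern as the snippet above.
function linkifyBare(text) {
    var rex = /(<a href=")?(?:https?:\/\/)?(?:\w+\.)+\w+/g;
    return text.replace(rex, function (match, insideHref) {
        // insideHref is set only when the match started with '<a href="',
        // meaning the URL is already linked: return it untouched.
        if (insideHref) {
            return match;
        }
        // Assumption: a bare URL without a scheme should point to http://.
        var href = /^https?:\/\//i.test(match) ? match : 'http://' + match;
        return '<a href="' + href + '">' + match + '</a>';
    });
}

var input = 'www.stackoverflow.com <a href="http://www.somesite.com">somesite</a>';
var output = linkifyBare(input);
// output:
// <a href="http://www.stackoverflow.com">www.stackoverflow.com</a> <a href="http://www.somesite.com">somesite</a>
```

One limitation to keep in mind: the check only looks at what immediately precedes the URL, so a URL that appears as the visible text of an existing link (for example <a href="...">www.somesite.com</a>) would still be wrapped a second time.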

MikeM answered Sep 23 '22 15:09


Replacing matches of the pattern /(https?:\/\/)?((?:www|ftp)\.[-A-Za-z0-9+&@#\/%?=~_|$!:,.;]+?)[\r\n\s]+/ with <a href="$1$2">$1$2</a> would meet your requirement.

A better regex to match against would be ^(?!href="[^"\n\r\s]+?").*?(https?:\/\/)?((?:www|ftp)\.[-A-Za-z0-9+&@#\/%?=~_|$!:,.;]+)$
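As a rough sketch of how the first pattern could be applied, the snippet below swaps the trailing [\r\n\s]+ for a lookahead, (?=\s|$), so the whitespace after the URL is checked but not consumed. That substitution, the function name linkifyWwwOrFtp, and the default "http://" scheme are assumptions made for this example. A URL inside href="..." is skipped because it is followed by a double quote rather than by whitespace.

```javascript
// Hypothetical helper; the pattern is a lookahead variant of the first regex above.
function linkifyWwwOrFtp(text) {
    var rex = /(https?:\/\/)?((?:www|ftp)\.[-A-Za-z0-9+&@#\/%?=~_|$!:,.;]+?)(?=\s|$)/g;
    return text.replace(rex, function (match, scheme, rest) {
        // scheme is undefined for bare URLs such as www.example.com.
        // Assumption: fall back to http:// in that case.
        var href = (scheme || 'http://') + rest;
        return '<a href="' + href + '">' + match + '</a>';
    });
}

var sample = 'see www.stackoverflow.com and <a href="http://www.somesite.com">somesite</a>';
var linked = linkifyWwwOrFtp(sample);
// linked:
// see <a href="http://www.stackoverflow.com">www.stackoverflow.com</a> and <a href="http://www.somesite.com">somesite</a>
```

Because the character class includes punctuation such as "." and ",", a URL at the end of a sentence will pick up the trailing punctuation; tightening the class is left to the reader's requirements.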

Naveed S answered Sep 23 '22 15:09