
Detect and remove URLs from textarea

Tags: jquery, regex

<textarea name="test">
  http://google.com/
  https://google.com/
  www.google.com/
  [url=http://google.com/]google.com[/url]
  text
</textarea>

My current attempt at checking if there is a URL in the textarea.

if ($('textarea[name="test"]').val().indexOf('[url') >= 0 ||
    $('textarea[name="test"]').val().match(/^http([s]?):\/\/.*/) ||
    $('textarea[name="test"]').val().match(/^www.[0-9a-zA-Z',-]./)) {

This doesn't reliably detect the URLs above, and I'm wondering how it can be improved. It feels sloppy and hacked together at the moment; hopefully someone can offer some insight.

My current attempt at removing URLs from the textarea:

var value = $('textarea[name="test"]').val();
    value = value.replace(/\[\/?url([^\]]+)?\]/g, '');
$('textarea[name="test"]').val(value);

Right now, it will output:

<textarea>
  http://google.com/
  https://google.com/
  www.google.com/
  google.com
  text
</textarea>

What I'd like my output to be:

<textarea>
  text
</textarea>
O P asked Feb 23 '13 19:02


3 Answers

Try (Corrected and improved after comments):

value = value.replace(/^(\[url=)?(https?:\/\/)?(www\.|\S+?\.)(\S+?\.)?\S+$\s*/mg, '');

Peeling the expression from end to start:

  • An address might have two or three 'parts', besides the scheme
  • An address might start with www or not
  • It may be preceded by http:// or https://
  • It may be enclosed inside [url=...]...[/url]

This expression does not enforce fully correct URL syntax; that would be a much tougher regex to write.
A few improvements you might want:

1. Awareness of spaces

value = value.replace(/^\s*(\[\s*url\s*=\s*)?(https?:\/\/)?(www\.|\S+?\.)(\S+?\.)?\S+\s*$\n?/mg, '');

2. Enforce no dots in the last part

value = value.replace(/^(\[url=)?(https?:\/\/)?(www\.|\S+?\.)(\S+?\.)?[^.\s]+$\s*/mg, '');
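
As a rough check, the first expression in this answer can be tried on the four URL forms from the question. This is only a sketch: the helper name stripUrlLines and the sample strings are made up for illustration, and the sample lines are assumed to have no leading indentation. The second call shows what "does not enforce the full correct syntax" means in practice: a dotted token that is not a URL is removed as well.

```javascript
// First expression from the answer, copied unchanged.
const urlLine = /^(\[url=)?(https?:\/\/)?(www\.|\S+?\.)(\S+?\.)?\S+$\s*/mg;

// Hypothetical helper name; it only wraps the replace call.
function stripUrlLines(value) {
  return value.replace(urlLine, '');
}

// Assumed input: the question's lines, without leading spaces.
const sample = [
  'http://google.com/',
  'https://google.com/',
  'www.google.com/',
  '[url=http://google.com/]google.com[/url]',
  'text'
].join('\n');

console.log(stripUrlLines(sample));        // "text"
console.log(stripUrlLines('v1.2\ntext'));  // "text" (the non-URL token "v1.2" is removed too)
```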
ilomambo answered Nov 08 '22 05:11


Regarding your attempt at checking if there is a URL in the textarea.

if ($('textarea[name="test"]').val().indexOf('[url') >= 0 ||
    $('textarea[name="test"]').val().match(/^http([s]?):\/\/.*/) ||
    $('textarea[name="test"]').val().match(/^www.[0-9a-zA-Z',-]./)) {

Firstly, rather than getting the textarea value three times using multiple function calls, it would be better to store it in a variable before checking, i.e.

var value = $('textarea[name="test"]').val();

Because of the ^, /^http([s]?):\/\/.*/ will only match if the "http://..." is found right at the beginning of the textarea value. The same applies to the ^www. pattern. Adding the multiline flag m to the end of the regex would make ^ match the start of each line, rather than just the start of the string.
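
A small sketch of the difference the m flag makes; the sample string is made up for illustration and puts the URL on the second line:

```javascript
// Hypothetical sample: the URL is not at the start of the string.
const sample = 'text\nhttp://google.com/';

// Without m, ^ anchors only to the start of the whole string.
const withoutM = /^https?:\/\//.test(sample);

// With m, ^ also anchors to the position after each line break.
const withM = /^https?:\/\//m.test(sample);

console.log(withoutM); // false
console.log(withM);    // true
```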

The .* in /^http([s]?):\/\/.*/ serves no purpose in a presence check: it can match zero characters, so it never changes whether the regex matches. The ([s]?) is better written as s?.

In /^www.[0-9a-zA-Z',-]./, the . needs to be escaped to match a literal . if that is your intention, i.e. \., and I assume you mean to match more than one of the characters in the character class so you need to follow it with +.

It is more efficient to use the RegExp test method rather than match when the actual matches are not required, so, combining the above, you could have

if ( /^(\[url|https?:\/\/|www\.)/m.test( value ) ) {
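
The same check wrapped in a function, as a sketch; the name containsUrl and the sample inputs are assumptions for illustration. Note the last call: a line that starts with spaces, like the indented lines in the question's textarea, is not detected, because the pattern has to start exactly at the beginning of a line.

```javascript
// Hypothetical helper: true when any line starts with [url, http://, https:// or www.
function containsUrl(value) {
  return /^(\[url|https?:\/\/|www\.)/m.test(value);
}

console.log(containsUrl('text\nwww.google.com/'));                        // true
console.log(containsUrl('[url=http://google.com/]google.com[/url]'));     // true
console.log(containsUrl('just text'));                                    // false
console.log(containsUrl('  www.google.com/'));                            // false (leading spaces)
```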

There is little point in the check anyway if you are only using it to decide whether to call replace, because the check is implicit in the replace call itself.

Using the simple criteria that strings of non-space characters at the start of a line and beginning with http[s]://, [url or www., should be removed, you could use

value = value.replace( /^(?:https?:\/\/|\[url|www\.)\S+\s*/gm, '' );
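
To illustrate both points (the replace, and the check being implicit in it), here is a sketch that returns the cleaned text together with a flag saying whether anything was removed. The function name removeUrlLines and the sample input are made up, and the sample lines are assumed to start at the beginning of each line with no indentation.

```javascript
// Hypothetical helper: removes URL-like runs at line starts and reports whether anything changed.
function removeUrlLines(value) {
  const cleaned = value.replace(/^(?:https?:\/\/|\[url|www\.)\S+\s*/gm, '');
  return { cleaned: cleaned, changed: cleaned !== value };
}

// Assumed input: the question's lines, without leading spaces.
const sample = [
  'http://google.com/',
  'https://google.com/',
  'www.google.com/',
  '[url=http://google.com/]google.com[/url]',
  'text'
].join('\n');

const result = removeUrlLines(sample);
console.log(result.cleaned);                  // "text"
console.log(result.changed);                  // true
console.log(removeUrlLines('text').changed);  // false
```

Comparing the result with the input gives the same information as a separate test call, without running a second regex.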

If the URLs can appear anywhere, you could use \b, meaning word boundary, instead of ^, and remove the m flag.

value = value.replace( /(?:\bhttps?:\/\/|\bwww\.|\[url)\S+\s*/g, '' );
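
A sketch of that version applied to URLs embedded in a sentence; the function name removeUrlsAnywhere and the sample sentence are assumptions for illustration. Each URL is removed together with the whitespace that follows it.

```javascript
// Hypothetical helper: removes URL-like runs wherever they appear in the text.
function removeUrlsAnywhere(value) {
  return value.replace(/(?:\bhttps?:\/\/|\bwww\.|\[url)\S+\s*/g, '');
}

// Made-up sentence with the three URL forms inline.
const sentence = 'see http://google.com/ and www.google.com/ or [url=http://google.com/]google.com[/url] text';

console.log(removeUrlsAnywhere(sentence)); // "see and or text"
```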

It would be a waste of effort to try to offer a better regex solution without precise details of what forms of url may appear in the textarea, where they may appear and what characters may adjoin them.

If any valid URL can appear anywhere in the textarea and be surrounded by any other characters, then there is no watertight solution.

MikeM answered Nov 08 '22 05:11


The jQuery code below will do the job:

<script>
// Warn about links in any textarea that has the class "linkdisable".
jQuery('.linkdisable').on('focusout', function () {
  var message = jQuery(this).val();
  // URLs with a scheme; the m flag lets $ match at the end of any line.
  var schemeUrl = /(https?|ftp):\/\/[a-z0-9]+([\-.][a-z0-9]+)*\.[a-z]{2,5}(:[0-9]{1,5})?(\/.*)?$/im;
  // Bare domains on a line of their own, e.g. example.com
  var bareDomain = /^[a-z0-9\-.]+\.(com|org|net|mil|edu)$/im;
  if (schemeUrl.test(message) || bareDomain.test(message)) {
    alert('Links Not Allowed');
  }
});
</script>
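
To see what these two patterns accept without wiring up a page, here is a jQuery-free sketch. The constant and function names are made up, the patterns are written in a condensed but equivalent form, and the inputs are single-line strings chosen for illustration. Note that www.google.com/ from the question is not caught: it has no scheme, and its trailing slash stops the second pattern from matching.

```javascript
// Condensed forms of the two patterns from this answer (hypothetical constant names).
const schemeUrl = /(https?|ftp):\/\/[a-z0-9]+([\-.][a-z0-9]+)*\.[a-z]{2,5}(:[0-9]{1,5})?(\/.*)?$/i;
const bareDomain = /^[a-z0-9\-.]+\.(com|org|net|mil|edu)$/i;

// Hypothetical helper: true when either pattern matches the given text.
function containsLink(text) {
  return schemeUrl.test(text) || bareDomain.test(text);
}

console.log(containsLink('http://google.com/')); // true
console.log(containsLink('google.com'));         // true
console.log(containsLink('www.google.com/'));    // false
console.log(containsLink('text'));               // false
```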
mayurdarji answered Nov 08 '22 07:11