
JavaScript URL regex splitting

I have a regex that detects URLs (disclosure: I copied it from the internet).

My goal is to split a string so that I get an array of substrings, each of which is either a complete URL or the plain text between URLs.

For example:

const detectUrls = // some magical Regex
const input = 'Here is a URL: https://google.com <- That was the URL to Google.';

console.log(input.split(detectUrls)); // This should output ['Here is a URL: ', 'https://google.com', ' <- That was the URL to Google.']

My current regex solution is as follows: /(([a-z]+:\/\/)?(([a-z0-9\-]+\.)+([a-z]{2}|aero|arpa|biz|com|coop|edu|gov|info|int|jobs|mil|museum|name|nato|net|org|pro|travel|local|internal))(:[0-9]{1,5})?(\/[a-z0-9_\-.~]+)*(\/([a-z0-9_\-.]*)(\?[a-z0-9+_\-.%=&]*)?)?(#[a-zA-Z0-9!$&'()*+.=-_~:@/?]*)?)(\s+|$)/gi;

However, when I run the example code with my regex, I get a useless answer:

[ 'Here is a URL: ', 
  'https://google.com', 
  'https://', 
  'google.com', 
  'google.', 
  'com', 
  undefined, 
  undefined, 
  undefined, 
  undefined, 
  undefined, 
  undefined, 
  ' ', 
  '<- That was the URL to Google.',
]

Would anyone be able to point me in the right direction? Thanks in advance.

asked Apr 26 '26 00:04 by Kristian Sakarisson


1 Answer

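The reason you are getting the extra entries is that split() adds the text matched by every capturing group (each parenthesized sub-pattern) to the result array, and a group that did not take part in the match shows up as undefined.
For the result you want, turn the inner groups into non-capturing groups, written (?:myRegex), and keep only the outer group that wraps the whole URL as a capturing group.
I modified your regex accordingly: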

/((?:[a-z]+:\/\/)?(?:(?:[a-z0-9\-]+\.)+(?:[a-z]{2}|aero|arpa|biz|com|coop|edu|gov|info|int|jobs|mil|museum|name|nato|net|org|pro|travel|local|internal))(?::[0-9]{1,5})?(?:\/[a-z0-9_\-.~]+)*(?:\/(?:[a-z0-9_\-.]*)(?:\?[a-z0-9+_\-.%=&]*)?)?(?:#[a-zA-Z0-9!$&'()*+.=-_~:@/?]*)?)(?:\s+|$)/i
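
To see the mechanism in isolation, here is a minimal sketch with a much simpler pattern. The three regexes below (nested, single, lookahead) are hypothetical stand-ins written only for illustration: they recognize an optional http/https scheme followed by a dotted hostname and are not the regex from the question. The lookahead variant is an extra suggestion, on the assumption that the space after the URL should stay in the next substring, as in the expected output from the question.

```javascript
// Simplified, hypothetical URL patterns for illustration only;
// they are NOT the full regex from the question.
const input = 'Here is a URL: https://google.com <- That was the URL to Google.';

// Every capturing group contributes its own entry to the split() result.
const nested = /((https?:\/\/)?([a-z0-9-]+\.)+[a-z]{2,})(\s+|$)/i;

// Only the outer group captures; the inner groups are non-capturing.
const single = /((?:https?:\/\/)?(?:[a-z0-9-]+\.)+[a-z]{2,})(?:\s+|$)/i;

// A lookahead checks for trailing whitespace without consuming it,
// so the space after the URL stays in the following substring.
const lookahead = /((?:https?:\/\/)?(?:[a-z0-9-]+\.)+[a-z]{2,})(?=\s|$)/i;

const withNested = input.split(nested);
const withSingle = input.split(single);
const withLookahead = input.split(lookahead);

console.log(withNested);
// [ 'Here is a URL: ', 'https://google.com', 'https://', 'google.', ' ',
//   '<- That was the URL to Google.' ]
console.log(withSingle);
// [ 'Here is a URL: ', 'https://google.com', '<- That was the URL to Google.' ]
console.log(withLookahead);
// [ 'Here is a URL: ', 'https://google.com', ' <- That was the URL to Google.' ]
```

Note that a trailing (?:\s+|$), as in the full regex above, consumes the whitespace after the URL, so that whitespace is missing from the following substring. Replacing that group with a lookahead such as (?=\s|$) is one way to keep it.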

Tip: an online tester such as https://regex101.com/ is handy for trying out regular expressions.
The answers to this related question also helped:
Use of capture groups in String.split()

answered Apr 30 '26 16:04 by szt