
How to exclude a whole word including an escaped character from a search?

I'm trying to find a way to match a single character except if it's inside a specific word.

I want to find every '/' except the one inside 'TCP/IP'. I found that a negative lookahead should do the job, but it has to exclude the whole word 'TCP/IP'. As soon as I escape the '/', the negative lookahead no longer behaves as I expect.

The regex I tested is:

(?!TCP\/IP)\/

The data to test:

PHP/JAVA/TCP/IP/PYTHON/JAVASCRIPT

It should match every '/' except the one inside 'TCP/IP'.

However, when I test the regex on regex101.com, the negative lookahead seems to stop working as soon as I add the \/:

Negative Lookahead (?!TCP\/IP)
Assert that the Regex below does not match
TCP matches the characters TCP literally (case insensitive)
\/ matches the character / literally (case insensitive)
IP matches the characters IP literally (case insensitive)

It seems like it's not considered as a single word anymore.
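A minimal reproduction in JavaScript, as a sketch only (the variable names are illustrative, and the global flag is an assumption about how the regex is applied). It suggests the likely cause: the lookahead is checked at the same position where `\/` has to match, so the text ahead always starts with '/', never with 'TCP/IP', and the assertion can never reject anything.

```javascript
// Test data and regex taken from the question above.
const input = "PHP/JAVA/TCP/IP/PYTHON/JAVASCRIPT";
const attempt = /(?!TCP\/IP)\//g;

// Collect the index of every slash the regex matches.
// The lookahead looks forward from the slash itself, so "TCP/IP"
// is never what follows and no slash is ever excluded.
const slashPositions = [...input.matchAll(attempt)].map((m) => m.index);

console.log(slashPositions);
// [3, 8, 12, 15, 22] (index 12 is the slash inside "TCP/IP")
```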

I think it can be fixed easily, but I'm out of ideas at the moment.

Thanks.

Asked by Barzou on Nov 23 '25 at 13:11


1 Answer

Instead of matching the slashes to split on, you could use the "reverse" regex and match the segments between them.

const string = "PHP/JAVA/TCP/IP/PYTHON/JAVASCRIPT";
const regex = /(TCP\/IP)(?=\/|$)|[^/]+/g;
//             ^       ^
// The group is unnecessary here, but is required in my second example.

console.log(string.match(regex));
// ["PHP", "JAVA", "TCP/IP", "PYTHON", "JAVASCRIPT"]

If you have more exceptions, you can make this dynamic by doing the following:

const string = "PHP/JAVA/TCP/IP/PYTHON/JAVASCRIPT/AB/CDE/FOO/UDP/TCP/AB/CD";
const exceptions = ["TCP/IP", "AB/CD", "AB/CDE", "UDP/TCP"];

// https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions#Escaping
function escapeRegExp(string) {
  return string.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

const alternatives = exceptions.map(escapeRegExp).join('|');
const pattern = `(${alternatives})(?=/|$)|[^/]+`;
const regex = new RegExp(pattern, "g");

console.log(string.match(regex));
// ["PHP", "JAVA", "TCP/IP", "PYTHON", "JAVASCRIPT", "AB/CDE", "FOO", "UDP/TCP", "AB/CD"]

Let me give you a short rundown of what this does.

  1. First escape all regex special characters inside the exceptions array.
  2. Join the escaped exceptions together with the | character (regex OR).
  3. Now for the regex itself. It first tries to match one of the exceptions, which must be followed by either a / character or the end of the string ($). If none of the exceptions match, it checks whether the current character is a non-/. If so, it matches as many non-/ characters as possible.
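To make the three steps concrete, here is a small sketch that prints the intermediate pattern. The exception "C++/CLI" and the sample input are hypothetical, chosen only because "+" is a regex metacharacter; without step 1, `new RegExp` would reject the pattern.

```javascript
// Escaping helper from the MDN guide, same as in the answer's code.
function escapeRegExp(string) {
  return string.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical exceptions; "C++/CLI" needs escaping before it can be used.
const sampleExceptions = ["C++/CLI", "TCP/IP"];

// Step 1: escape each exception. Step 2: join them with "|".
const alternation = sampleExceptions.map(escapeRegExp).join("|");

// Step 3: an exception followed by "/" or end of string, else a run of non-"/".
const samplePattern = `(${alternation})(?=/|$)|[^/]+`;
console.log(samplePattern);
// (C\+\+/CLI|TCP/IP)(?=/|$)|[^/]+

const sampleRegex = new RegExp(samplePattern, "g");
console.log("C/C++/CLI/TCP/IP/GO".match(sampleRegex));
// ["C", "C++/CLI", "TCP/IP", "GO"]
```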

Note: If you have both A/B and A/B/C as exceptions, rearrange the array so that A/B/C comes before A/B. Otherwise the string "A/B/C" yields the matches ["A/B", "C"], because A/B is indeed followed by a forward slash and is tried first. Sorting the array by string length (longest first) resolves this.
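The sorting from the note could be folded into a helper along these lines. This is a sketch, not a definitive implementation: `splitKeeping` is a made-up name, and it uses a non-capturing group because nothing here reads the capture.

```javascript
// Escaping helper from the MDN guide, same as in the answer's code.
function escapeRegExp(string) {
  return string.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: split `input` on "/" while keeping each exception whole.
function splitKeeping(input, exceptions) {
  const alternation = [...exceptions]          // copy so the caller's array is not reordered
    .sort((a, b) => b.length - a.length)       // longest first, so A/B/C is tried before A/B
    .map(escapeRegExp)
    .join("|");

  // With no exceptions, fall back to plain runs of non-"/" characters.
  const pattern = alternation
    ? `(?:${alternation})(?=/|$)|[^/]+`
    : "[^/]+";

  return input.match(new RegExp(pattern, "g")) || [];
}

console.log(splitKeeping("A/B/C", ["A/B", "A/B/C"])); // ["A/B/C"]
console.log(splitKeeping("A/B/D", ["A/B", "A/B/C"])); // ["A/B", "D"]
```

Sorting a copy keeps the helper free of side effects on the array passed in.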

Answered by 3limin4t0r on Nov 26 '25 at 02:11


