Match string unless starts and ends with character

I'm trying to use a regex in JavaScript to decide if a message gets deleted. I want to delete the message if it contains "string" anywhere, unless it's surrounded by colons.

  • string - gets deleted
  • blah string - gets deleted
  • :string blah - gets deleted
  • :string: string - gets deleted
  • thing :string: - doesn't get deleted

So far I'm using message.match(/string/i) to decide whether the message gets deleted. I've tried a negative lookahead, but I probably used it wrong.

EDIT: Sorry for not including this earlier, but :blahstring:, :stringblah: and :blahstringblah: should not be deleted either.

Kognise asked Dec 23 '22 01:12



2 Answers

There are some boundary cases where the colon appears only at one side of "string". Therefore I believe it is easier to remove all occurrences of ":string:" and only then look for a match of "string":

function deleteIt(msg) {
    return /string/i.test(msg.replace(/:\w*string\w*(?=:)/ig, ":"));
}

console.log(deleteIt("this is :string ")); // true
console.log(deleteIt("this is string: ")); // true
console.log(deleteIt("string:string: ")); // true
console.log(deleteIt("this is :string: ")); // false
console.log(deleteIt("this is :blastring:stringbla:string: ")); // false

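The last test in the above snippet is a special case: each inner colon is "shared" by a preceding and a following "string". Because the look-ahead does not consume the second colon, that colon can also open the next occurrence, so all three occurrences are ignored. If you want such shared-colon occurrences to count as plain "string" instead, replace the look-ahead with a normal match that consumes the second colon.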
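
As a minimal sketch of that alternative: the name deleteItStrict is made up for illustration, and replacing each match with a space is my own assumption, chosen so that removing a match cannot glue the surrounding text into a new "string". Here the second colon is consumed, so one colon can no longer close one occurrence and open the next:

```javascript
// Hypothetical variant of deleteIt: the closing colon is part of the match,
// so a colon cannot be shared by two neighbouring occurrences.
function deleteItStrict(msg) {
    // Replace with a space (an assumption) so the leftovers are not joined together.
    return /string/i.test(msg.replace(/:\w*string\w*:/ig, " "));
}

console.log(deleteItStrict("this is :string: ")); // false
console.log(deleteItStrict("this is :string ")); // true
console.log(deleteItStrict("this is :blastring:stringbla:string: ")); // true
```

In the last call only ":blastring:" and ":string:" are removed; "stringbla" has lost its opening colon to the first match and therefore triggers the deletion.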

Addendum

In your edit to the question, you say that ":blastring:" or ":stringbla:" should also not trigger a deletion.

So I added \w* twice in the regex above to align with that extra requirement.

If punctuation or other non-alphanumeric characters should also be allowed between the colons and "string", as in ":,-°string^0&:", and only white-space should be excluded, then use \S* instead of \w*.
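
A short sketch of the difference, assuming the same deleteIt as above and a hypothetical deleteItLoose that only swaps \w* for \S*:

```javascript
// deleteIt as given in the answer: only word characters may sit next to "string".
function deleteIt(msg) {
    return /string/i.test(msg.replace(/:\w*string\w*(?=:)/ig, ":"));
}

// Hypothetical variant: any non-white-space characters may sit next to "string".
function deleteItLoose(msg) {
    return /string/i.test(msg.replace(/:\S*string\S*(?=:)/ig, ":"));
}

console.log(deleteIt(":,-°string^0&:"));      // true  (punctuation is not matched by \w)
console.log(deleteItLoose(":,-°string^0&:")); // false (punctuation is matched by \S)
console.log(deleteItLoose(":a string:"));     // true  (white-space still breaks the wrapping)
```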

trincot answered Dec 25 '22 15:12



If lookbehind is supported you may use

/(?<!:(?=string:))string/i


Details

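  • (?<!:(?=string:)) - a negative lookbehind that fails the match if, immediately to the left of the current location, there is a : that is immediately followed by string: (that is, this occurrence is wrapped in colons on both sides)
  • string - the literal text string (matched case-insensitively because of the i flag)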
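
To see which occurrences the lookbehind lets through, here is a small sketch that lists the index of every unprotected "string". It assumes an engine that supports both lookbehind and String.prototype.matchAll; flaggedPositions is a hypothetical helper, not part of the answer:

```javascript
// Same pattern as above, with the g flag added because matchAll requires it.
const rxAll = /(?<!:(?=string:))string/gi;

// Hypothetical helper: start index of every "string" that is NOT wrapped in colons.
function flaggedPositions(msg) {
    return Array.from(msg.matchAll(rxAll), function (m) { return m.index; });
}

console.log(flaggedPositions(":string: string")); // [ 9 ]
console.log(flaggedPositions(":string blah"));    // [ 1 ]
console.log(flaggedPositions("thing :string:"));  // []
```

An empty array means the message is kept; any index means it gets deleted.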

var strs = ['string - gets deleted','blah string - gets deleted',':string blah - gets deleted',':string: string - gets deleted','thing :string: - doesnt get deleted'];
var rx = /(?<!:(?=string:))string/i;
for (var s of strs) {
  console.log(s, "=>", rx.test(s));
}

Output:

string - gets deleted => true
blah string - gets deleted => true
:string blah - gets deleted => true
:string: string - gets deleted => true
thing :string: - doesnt get deleted => false
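
The pattern above only protects an exact ":string:". To also cover the question's edit (:blahstring:, :stringblah:, :blahstringblah:), one possible extension, offered as a sketch rather than as part of the original answer, is to allow word characters inside both assertions. It relies on JavaScript lookbehinds accepting variable-length patterns; rxWide and shouldDelete are names made up here:

```javascript
// Sketch: skip "string" when a colon plus optional word characters precede it
// and optional word characters plus a colon follow it.
const rxWide = /(?<!:\w*(?=string\w*:))string/i;

// Hypothetical wrapper: true means the message should be deleted.
function shouldDelete(msg) {
    return rxWide.test(msg);
}

console.log(shouldDelete(":blahstring:"));     // false
console.log(shouldDelete(":stringblah:"));     // false
console.log(shouldDelete(":blahstringblah:")); // false
console.log(shouldDelete(":string blah"));     // true
console.log(shouldDelete(":string: string"));  // true
```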

A version without lookbehind

It is based on a regex that matches "string" and, only when the occurrence has colons on both sides, also includes the leading colon in the match. If at least one of the matches does not start with a colon, the entry must be deleted.

var strs = ['string - gets deleted','blah string - gets deleted',':string blah - gets deleted',':string: string - gets deleted','thing :string: - doesnt get deleted'];
var rx = /(?::(?=string:))?string/gi;
for (var s of strs) {
  var matches = s.match(rx) || []; // match() returns null when "string" does not occur at all
  console.log(s, "=>", matches.some(function (x) { return !/^:/.test(x); }));
}
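
The two versions can be combined behind a feature check. The following is only a sketch under my own assumptions: supportsLookbehind, checkWithoutLookbehind and shouldDeleteMessage are hypothetical names. The lookbehind pattern is built with the RegExp constructor because a regex literal containing a lookbehind would be a syntax error at parse time on engines that lack the feature:

```javascript
// Hypothetical helper: true if this engine can compile a lookbehind.
function supportsLookbehind() {
    try {
        new RegExp("(?<!a)b");
        return true;
    } catch (e) {
        return false;
    }
}

// The version without lookbehind, wrapped in a function and guarded against null.
function checkWithoutLookbehind(msg) {
    var matches = msg.match(/(?::(?=string:))?string/gi) || [];
    return matches.some(function (x) { return x.charAt(0) !== ":"; });
}

// Hypothetical entry point: true means the message should be deleted.
function shouldDeleteMessage(msg) {
    if (supportsLookbehind()) {
        return new RegExp("(?<!:(?=string:))string", "i").test(msg);
    }
    return checkWithoutLookbehind(msg);
}

console.log(shouldDeleteMessage(":string: string")); // true
console.log(shouldDeleteMessage("thing :string:"));  // false
console.log(shouldDeleteMessage("nothing here"));    // false
```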
Wiktor Stribiżew answered Dec 25 '22 14:12
