
JavaScript split on char but ignoring double escaped chars

Tags:

javascript

I'm trying to do something similar to the question below, but I can't get it working.

How to split a comma separated String while ignoring escaped commas?

I have tried to figure it out, but I can't seem to get it right.

I would like to split the string on : but not on the escaped form \\:
(my escape sequence is a double backslash)

given: dtet:du\\,eduh ei\\:di:e,j
expected outcome: ["dtet", "du\\,eduh ei\\:di", "e,j"]

regex link: https://regex101.com/r/12j6er/1/

Seabizkit asked Apr 25 '26 12:04


2 Answers

See the function below named splitOnNonEscapedDelimeter(), which accepts the string to split and the delimiter to split on, which in this case is :. It is called from the function onChange().

Note that you must escape the delimiter you pass to splitOnNonEscapedDelimeter() if it is a special character in regular expressions, so that it is treated literally.

function nonEscapedDelimeter(delimeter) {
  return new RegExp(String.raw`[^${delimeter}]*?(?:\\\\${delimeter}[^${delimeter}]*?)*(?:${delimeter}|$)`, 'g')
}

function nonEscapedDelimeterAtEnd(delimeter) {
  return new RegExp(String.raw`([^\\].|.[^\\]|^.?)${delimeter}$`)
}

function splitOnNonEscapedDelimeter(string, delimeter) {
  const reMatch = nonEscapedDelimeter(delimeter)
  const reReplace = nonEscapedDelimeterAtEnd(delimeter)

  return string.match(reMatch).slice(0, -1).map(section => {
    return section.replace(reReplace, '$1')
  })
}

function onChange() {
  console.log(splitOnNonEscapedDelimeter(i.value, ':'))
}

i.addEventListener('change', onChange)

onChange()
<textarea id=i>dtet:du\\,eduh ei\\:di:e,j</textarea>
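The note about escaping the delimiter can be handled with a small helper. The sketch below is an assumption, not part of the original answer: escapeRegExp() is a hypothetical name, and it simply prefixes every regular-expression metacharacter with a backslash so the result can be embedded in a pattern. It assumes a single-character delimiter, since the answer's pattern places the delimiter inside a character class.

```javascript
// Hypothetical helper (not from the original answer): escapes characters
// that have a special meaning inside a regular expression.
function escapeRegExp(text) {
  // '$&' in the replacement stands for the matched character itself.
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

const escapedPipe = escapeRegExp('|');   // a backslash followed by '|'
const escapedColon = escapeRegExp(':');  // ':' is not special, so unchanged

console.log(escapedPipe, escapedColon);
```

With such a helper, a call might look like splitOnNonEscapedDelimeter(text, escapeRegExp('|')) instead of escaping the delimiter by hand.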

Requirements

This solution uses the ES2015 features String.raw() and template literals for convenience, though neither is required. See their documentation to understand how they work. If your target platform does not support them, use a polyfill for String.raw() and a transpiler for template literals.

Explanation

new RegExp(String.raw`[^${delimeter}]*?(?:\\\\${delimeter}[^${delimeter}]*?)*(?:${delimeter}|$)`, 'g')

The function nonEscapedDelimeter() creates a regular expression that does almost what is required, but with a few quirks that must be corrected in post-processing.

string.match(reMatch)

The regular expression, when used in String#match(), splits the string into sections that either end with a non-escaped delimiter or run to the end of the string. This has the side effect of also matching a zero-width section at the end of the string, so we need to

.slice(0, -1)

to remove that match in post-processing.
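To make this step concrete, here is a small sketch of what String#match() returns for the question's input before and after .slice(0, -1). The regex literal is the pattern the answer builds when the delimiter is ':'. The variable names input, rawMatches, and sections are chosen here for illustration, and String.raw is used so the input keeps both backslashes, as it would in the textarea.

```javascript
// Same pattern the answer builds for ':' (two literal backslashes escape it).
const reMatch = /[^:]*?(?:\\\\:[^:]*?)*(?::|$)/g;

// String.raw keeps both backslashes, mirroring the textarea content.
const input = String.raw`dtet:du\\,eduh ei\\:di:e,j`;

// Four matches: three sections plus a zero-width match at the end.
const rawMatches = input.match(reMatch);

// Drop the trailing zero-width match.
const sections = rawMatches.slice(0, -1);

console.log(rawMatches.length); // 4
console.log(sections.length);   // 3
console.log(sections[0]);       // dtet:
```

The first two sections still carry their trailing ':' at this point; that is what the second regular expression removes.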

new RegExp(String.raw`([^\\].|.[^\\]|^.?)${delimeter}$`)

...

.map(section => {
  return section.replace(reReplace, '$1')
})

Each section now ends with the delimiter, except for the last one (which ends at the end of the string). We therefore .map() over the array of matches and remove the trailing non-escaped delimiter if it is there, putting back the captured characters in front of it with '$1'. Making sure that the delimiter is not an escaped one is why nonEscapedDelimeterAtEnd() is so complicated.

Patrick Roberts answered Apr 27 '26 01:04


This approach is a little lengthy, but it works. JavaScript regular expressions did not support lookbehind before ES2018, but you can get the same effect by reversing the original string and splitting it with a lookahead instead. Then reverse the resulting array and every string in it to get your result.

function reverse(s) {
  var o = '';
  for (var i = s.length - 1; i >= 0; i--)
    o += s[i];
  return o;
}


var str = "dtet:du\\,eduh ei\\:di:e,j";
var res = reverse(str);
var result  = res.split(/:(?!\\)/g);
result  = result.reverse();
for(var i = 0; i < result.length; i++){
	result[i] = reverse(result[i]);
}

console.log(result);
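On engines that implement ES2018 lookbehind, the reverse-and-split workaround above may not be needed. The following is a minimal sketch under that assumption, not part of the original answer. It uses the same string literal, which contains single backslashes at runtime, and splits on any ':' that is not immediately preceded by a backslash.

```javascript
// Same literal as above: each "\\" is one backslash at runtime.
const str = "dtet:du\\,eduh ei\\:di:e,j";

// Negative lookbehind: match ':' only when the previous character
// is not a backslash. Requires ES2018 lookbehind support.
const parts = str.split(/(?<!\\):/);

console.log(parts.length); // 3
console.log(parts[0]);     // dtet
console.log(parts[2]);     // e,j
```

Because the lookbehind only inspects the single character before the colon, this also works when the escape is two backslashes, but it cannot tell an escaped colon from a colon that follows an escaped backslash.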
Manoj Verma answered Apr 27 '26 01:04


