
Split string by delimiter, but not if it is escaped

How can I split a string by a delimiter, but not if it is escaped? For example, I have a string:

1|2\|2|3\\|4\\\|4 

The delimiter is | and an escaped delimiter is \|. Furthermore I want to ignore escaped backslashes, so in \\| the | would still be a delimiter.

So with the above string the result should be:

    [0] => 1
    [1] => 2\|2
    [2] => 3\\
    [3] => 4\\\|4
Anton asked Jun 05 '11 15:06




2 Answers

Use dark magic:

$array = preg_split('~\\\\.(*SKIP)(*FAIL)|\|~s', $string); 

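In the single-quoted PHP string, \\\\. is the regex \\., which matches a backslash followed by any character (the s modifier lets . match a newline too). (*SKIP)(*FAIL) then discards that match and stops the engine from retrying inside it, so an escaped character is never treated as a delimiter. \| matches your delimiter everywhere else.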
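
To see the pattern at work on the string from the question, here is a minimal sketch. The wrapper name split_unescaped is hypothetical (it is not part of the answer), and the delimiter is hard-coded to | as in the question.

```php
<?php
// Hypothetical helper (the name is an assumption, not from the answer):
// split on "|" unless the "|" is escaped by a backslash.
function split_unescaped(string $string): array
{
    // After PHP unescapes the single-quoted literal, the regex is:
    //   \\.(*SKIP)(*FAIL)|\|
    // i.e. "skip any backslash + character pair, otherwise match a pipe".
    return preg_split('~\\\\.(*SKIP)(*FAIL)|\|~s', $string);
}

// Single-quoted literal for the question's string: 1|2\|2|3\\|4\\\|4
$input = '1|2\|2|3\\\\|4\\\\\\|4';
print_r(split_unescaped($input));
```

The backslashes are left in the resulting fields; if the unescaped values are needed, that is a separate step after splitting.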

NikiC answered Oct 07 '22 21:10



Instead of split(...), it's IMO more intuitive to use some sort of "scan" function that operates like a lexical tokenizer. In PHP that would be the preg_match_all function. You simply say you want to match:

  1. something other than a \ or |
  2. or a \ followed by a \ or |
  3. repeat #1 or #2 at least once

The following demo:

    $input = "1|2\\|2|3\\\\|4\\\\\\|4";
    echo $input . "\n\n";
    preg_match_all('/(?:\\\\.|[^\\\\|])+/', $input, $parts);
    print_r($parts[0]);

will print:

    1|2\|2|3\\|4\\\|4

    Array
    (
        [0] => 1
        [1] => 2\|2
        [2] => 3\\
        [3] => 4\\\|4
    )
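
One practical difference between the scanning approach and preg_split may matter when choosing between them: because the scan pattern requires at least one character per token, it does not produce empty fields, while preg_split keeps them. The sketch below compares the two on a string with two adjacent delimiters. The function names are hypothetical; the patterns are the ones from the two answers.

```php
<?php
// Hypothetical helper names; only the regex patterns come from the answers.

// Tokenizer approach: collect runs of "escaped pair or ordinary character".
function tokens_by_scan(string $s): array
{
    preg_match_all('/(?:\\\\.|[^\\\\|])+/', $s, $parts);
    return $parts[0];
}

// Split approach: cut at every pipe that is not part of an escaped pair.
function fields_by_split(string $s): array
{
    return preg_split('~\\\\.(*SKIP)(*FAIL)|\|~s', $s);
}

print_r(tokens_by_scan('a||b'));   // two tokens: "a" and "b"
print_r(fields_by_split('a||b'));  // three fields: "a", "" and "b"
```

If empty fields carry meaning in your data, the split variant is probably the safer choice.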
Bart Kiers answered Oct 07 '22 21:10
