I am parsing some delimiter-separated values, where ? is specified as the escape character in case the delimiter appears as part of one of the values.
For instance: if : is the delimiter and a certain field has the value 19:30, this needs to be written as 19?:30.
Currently, I use string[] values = input.Split(':'); in order to get an array of all values, but after learning about this escape character, this won't work anymore.
Is there a way to make Split take escape characters into account? I have checked the overloads, and there does not seem to be such an option directly.
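You can use Regex.Split with a negative lookbehind, so that a : preceded by ? is not treated as a delimiter:
string[] substrings = Regex.Split("aa:bb:00?:99:zz", @"(?<!\?):");
which gives: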
aa
bb
00?:99
zz
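Note that the lookbehind above treats every : that follows a ? as escaped. If the format also lets ?? stand for a literal ? (an assumption; the question does not say so), an input such as x??:zz would then not be split at all. A minimal sketch under that assumption is to split only on a : that is preceded by an even number of ? characters. The class and method names here are illustrative only, and the fields are still returned in their escaped form:

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookbehindSplitDemo
{
    // Matches ':' only when it is preceded by an even number (possibly zero) of '?'.
    // The group is non-capturing so that Regex.Split does not add captures to the result.
    private static readonly Regex UnescapedColon = new Regex(@"(?<=(?<!\?)(?:\?\?)*):");

    // Hypothetical helper, not part of the .NET API.
    public static string[] SplitFields(string input) => UnescapedColon.Split(input);

    public static void Main()
    {
        Console.WriteLine(string.Join(" | ", SplitFields("aa:00?:99:x??:zz")));
        // prints: aa | 00?:99 | x?? | zz
    }
}
```

This relies on .NET allowing variable-length patterns inside a lookbehind, which many other regex engines do not.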
Alternatively, since you will probably want to unescape ?: at some point anyway, replace that sequence in the input with a placeholder token, split, and then replace the placeholder with : in each value.
(The Regex approach requires a using directive for the System.Text.RegularExpressions namespace.)
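The replace, split, replace-back idea could look roughly like the sketch below. It assumes that the placeholder character '\u0000' never occurs in real input and that ?: is the only escape sequence in use (so ?? is not treated specially). SplitEscaped and the class name are hypothetical helpers, not part of the .NET API:

```csharp
using System;
using System.Linq;

public static class EscapedSplitDemo
{
    // Hypothetical helper: hides escaped delimiters, splits, then restores them unescaped.
    public static string[] SplitEscaped(string input)
    {
        const string placeholder = "\u0000"; // assumption: never appears in the input

        return input
            .Replace("?:", placeholder)                        // hide escaped delimiters
            .Split(':')                                        // plain split on real delimiters
            .Select(field => field.Replace(placeholder, ":"))  // restore them as literal ':'
            .ToArray();
    }

    public static void Main()
    {
        foreach (var value in SplitEscaped("aa:bb:00?:99:zz"))
        {
            Console.WriteLine(value);
        }
        // prints aa, bb, 00:99 and zz on separate lines
    }
}
```

The values come back already unescaped, so no second pass is needed, at the cost of reserving one character as the placeholder.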
This kind of stuff is always fun to code without using Regex.
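The following does the trick, with one caveat: the escape character always escapes whatever character follows it; there is no logic to check that the escaped sequence is a valid one (?; or ??). So, with ; as the separator and ? as the escape character, the string one?two;three??;four?;five will be split into onetwo, three? and four;five.
using System.Collections.Generic;
using System.Text;

// Extension methods must be declared in a static class; the class name is arbitrary.
public static class StringSplitExtensions
{
    public static IEnumerable<string> Split(this string text, char separator, char escapeCharacter, bool removeEmptyEntries)
    {
        var buffer = new StringBuilder();
        bool escape = false; // true when the previous character was an unconsumed escape character

        foreach (var c in text)
        {
            if (escape)
            {
                // The character right after an escape is always taken literally,
                // whether it is the separator, the escape character or anything else.
                buffer.Append(c);
                escape = false;
            }
            else if (c == escapeCharacter)
            {
                // Drop the escape character itself and remember it for the next character.
                escape = true;
            }
            else if (c == separator)
            {
                if (!removeEmptyEntries || buffer.Length > 0)
                {
                    yield return buffer.ToString();
                }
                buffer.Clear();
            }
            else
            {
                buffer.Append(c);
            }
        }

        // Emit the last field; a trailing empty field is kept unless removeEmptyEntries is set.
        if (!removeEmptyEntries || buffer.Length > 0)
        {
            yield return buffer.ToString();
        }
    }
}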