I'd like to write an extension method for the .NET String class. I'd like it to be a special variation on the Split method: one that takes an escape character and does not split the string where the escape character appears immediately before the separator.
What's the best way to write this? I'm curious about the best non-regex way to approach it.
Something with a signature like...
public static string[] Split(this string input, string separator, char escapeCharacter)
{
// ...
}
UPDATE: Because it came up in one of the comments, a note on the escaping...
In C# when escaping non-special characters you get the error - CS1009: Unrecognized escape sequence.
In IE JScript the escape characters are thrown out, unless you try \u, in which case you get an "Expected hexadecimal digit" error. I tested Firefox and it has the same behavior.
I'd like this method to be pretty forgiving and follow the JavaScript model. If you escape a character that is not the separator, it should just "kindly" remove the escape character.
How about:
public static IEnumerable<string> Split(this string input,
                                        string separator,
                                        char escapeCharacter)
{
    int startOfSegment = 0;
    int index = 0;
    while (index < input.Length)
    {
        index = input.IndexOf(separator, index);
        if (index == -1)
        {
            // No more separators; the remainder is returned after the loop.
            break;
        }
        if (index > 0 && input[index - 1] == escapeCharacter)
        {
            // Escaped separator: skip it and keep scanning the same segment.
            index += separator.Length;
            continue;
        }
        yield return input.Substring(startOfSegment, index - startOfSegment);
        index += separator.Length;
        startOfSegment = index;
    }
    // Everything after the last unescaped separator.
    yield return input.Substring(startOfSegment);
}
That seems to work (with a few quick test strings), but it doesn't remove the escape character. Whether you need that will depend on your exact situation, I suspect.
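If you do want the escape character removed, one option is to build each segment character by character instead of jumping between separators with IndexOf. The following is only a minimal sketch, not a tested library routine. The class name StringSplitExtensions and the method name SplitAndUnescape are made up for this example, and it assumes a non-empty separator that is compared ordinally. An escape character in front of anything other than the separator is simply dropped, in line with the forgiving behavior the question asks for. One consequence is that a doubled escape character produces a single literal one, and an escape character at the very end of the input is kept as-is.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper class; the names are illustrative only.
public static class StringSplitExtensions
{
    public static IEnumerable<string> SplitAndUnescape(this string input,
                                                       string separator,
                                                       char escapeCharacter)
    {
        // Assumption: an empty separator is treated as an error.
        // Because this is an iterator, these checks run on first enumeration.
        if (input == null) throw new ArgumentNullException(nameof(input));
        if (string.IsNullOrEmpty(separator))
            throw new ArgumentException("Separator must not be empty.", nameof(separator));

        var segment = new StringBuilder();
        int index = 0;
        while (index < input.Length)
        {
            char current = input[index];
            if (current == escapeCharacter && index + 1 < input.Length)
            {
                // Drop the escape character itself.
                index++;
                if (string.CompareOrdinal(input, index, separator, 0, separator.Length) == 0)
                {
                    // Escaped separator: keep it as literal text.
                    segment.Append(separator);
                    index += separator.Length;
                }
                else
                {
                    // Escaped non-separator: keep only the following character.
                    segment.Append(input[index]);
                    index++;
                }
            }
            else if (string.CompareOrdinal(input, index, separator, 0, separator.Length) == 0)
            {
                // Unescaped separator: the current segment is complete.
                yield return segment.ToString();
                segment.Clear();
                index += separator.Length;
            }
            else
            {
                segment.Append(current);
                index++;
            }
        }
        // Whatever follows the last unescaped separator (possibly empty).
        yield return segment.ToString();
    }
}
```

With this sketch, "a,b\\,c,d".SplitAndUnescape(",", '\\') should produce the three segments a, b,c and d, and "x\\yz".SplitAndUnescape(",", '\\') should produce the single segment xyz.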
This will need to be cleaned up a bit, but this is essentially it....
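// Assumes input (string), separator (non-empty string) and escapeChar (char) are the method's parameters.
List<string> output = new List<string>();
int j = 0; // start of the current segment
for (int i = 0; i < input.Length; ++i)
{
    // Split where the separator starts at i and is not preceded by the escape character.
    if (string.CompareOrdinal(input, i, separator, 0, separator.Length) == 0
        && (i == 0 || input[i - 1] != escapeChar))
    {
        output.Add(input.Substring(j, i - j));
        j = i + separator.Length; // the next segment starts after the separator
        i = j - 1;                // skip the rest of the separator (the loop adds 1)
    }
}
output.Add(input.Substring(j)); // add the final segment
return output.ToArray();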