C# Extension Method - String Split that also accepts an Escape Character

I'd like to write an extension method for the .NET String class. I'd like it to be a special variation on the Split method - one that takes an escape character and does not split the string where that escape character appears immediately before the separator.

What's the best way to write this? I'm curious about the best non-regex way to approach it.
Something with a signature like...

public static string[] Split(this string input, string separator, char escapeCharacter)
{
   // ...
}

UPDATE: Because it came up in one of the comments, the escaping...

In C# when escaping non-special characters you get the error - CS1009: Unrecognized escape sequence.

In IE JScript the escape characters are thrown out, unless you try \u, in which case you get an "Expected hexadecimal digit" error. I tested Firefox and it has the same behavior.

I'd like this method to be pretty forgiving and follow the JavaScript model. If you escape a character that is not a separator, it should just "kindly" remove the escape character.

asked Mar 11 '09 14:03 by BuddyJoe


2 Answers

How about:

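public static IEnumerable<string> Split(this string input,
                                        string separator,
                                        char escapeCharacter)
{
    int startOfSegment = 0;
    int index = 0;
    while (index < input.Length)
    {
        // Find the next occurrence of the separator at or after index.
        index = input.IndexOf(separator, index);
        if (index == -1)
        {
            // No more separators; the rest of the input is the last segment.
            break;
        }
        if (index > 0 && input[index-1] == escapeCharacter)
        {
            // Escaped separator: skip over it and keep scanning.
            index += separator.Length;
            continue;
        }
        yield return input.Substring(startOfSegment, index-startOfSegment);
        index += separator.Length;
        startOfSegment = index;
    }
    // Whatever follows the last unescaped separator (possibly empty).
    yield return input.Substring(startOfSegment);
}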

That seems to work (with a few quick test strings), but it doesn't remove the escape character - that will depend on your exact situation, I suspect.
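
If the escape character does need to be removed, as the question's update asks, one possible variation is to scan character by character and build each segment in a StringBuilder. This is only a sketch under assumptions, not part of the original answer: the class name StringSplitExtensions and the method name SplitAndUnescape are made up for illustration, the separator is assumed to be non-empty, matching is ordinal, and a lone escape character at the very end of the input is kept literally.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class StringSplitExtensions
{
    // Hypothetical helper: splits on separator, drops escape characters,
    // and keeps whatever character (or separator) follows each escape.
    public static IEnumerable<string> SplitAndUnescape(this string input,
                                                       string separator,
                                                       char escapeCharacter)
    {
        // Assumption: separator is non-empty (an empty one would never advance).
        var segment = new StringBuilder();
        int index = 0;
        while (index < input.Length)
        {
            char c = input[index];
            if (c == escapeCharacter && index + 1 < input.Length)
            {
                index++; // drop the escape character itself
                if (string.CompareOrdinal(input, index, separator, 0, separator.Length) == 0)
                {
                    // Escaped separator: keep it as literal text.
                    segment.Append(separator);
                    index += separator.Length;
                }
                else
                {
                    // Escaped ordinary character: keep the character only.
                    segment.Append(input[index]);
                    index++;
                }
            }
            else if (string.CompareOrdinal(input, index, separator, 0, separator.Length) == 0)
            {
                // Unescaped separator: the current segment is complete.
                yield return segment.ToString();
                segment.Clear();
                index += separator.Length;
            }
            else
            {
                segment.Append(c);
                index++;
            }
        }
        // Final segment (possibly empty).
        yield return segment.ToString();
    }
}
```

With this sketch, @"a\,b,c\d".SplitAndUnescape(",", '\\') yields "a,b" and then "cd". A doubled escape character comes out as a single literal one, because the second is treated as an escaped ordinary character.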

answered Oct 24 '22 09:10 by Jon Skeet


This will need to be cleaned up a bit, but this is essentially it....

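// Method body; assumes separator and escapeChar are both of type char.
List<string> output = new List<string>();
int j = 0; // start of the current segment
for (int i = 0; i < input.Length; ++i)
{
    // Split on a separator unless the preceding character is the escape character.
    if (input[i] == separator && (i == 0 || input[i-1] != escapeChar))
    {
        output.Add(input.Substring(j, i - j));
        j = i + 1; // the next segment starts just after the separator
    }
}
output.Add(input.Substring(j)); // text after the last separator

return output.ToArray();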

answered Oct 24 '22 09:10 by James Curran