Why does every Char static "Is..." have a string overload, e.g. IsWhiteSpace(string, Int32)?

Question

http://msdn.microsoft.com/en-us/library/1x308yk8.aspx

This allows me to do this:

var str = "string ";
Char.IsWhiteSpace(str, 6);

Rather than:

Char.IsWhiteSpace(str[6]);

This seemed unusual, so I looked at the decompiled source:

[TargetedPatchingOptOut("Performance critical to inline across NGen image boundaries")]
public static bool IsWhiteSpace(char c)
{
    if (char.IsLatin1(c))
    {
        return char.IsWhiteSpaceLatin1(c);
    }
    return CharUnicodeInfo.IsWhiteSpace(c);
}

[SecuritySafeCritical]
public static bool IsWhiteSpace(string s, int index)
{
    if (s == null)
    {
        throw new ArgumentNullException("s");
    }
    if (index >= s.Length)
    {
        throw new ArgumentOutOfRangeException("index");
    }
    if (char.IsLatin1(s[index]))
    {
        return char.IsWhiteSpaceLatin1(s[index]);
    }
    return CharUnicodeInfo.IsWhiteSpace(s, index);
}

Three things struck me:

Why does it bother to check only the upper bound? An index at or past the end throws ArgumentOutOfRangeException, while an index below 0 would give string's standard IndexOutOfRangeException.
The presence of SecuritySafeCriticalAttribute. I've read the general blurb about it, but it is still unclear to me what it is doing here and whether it is linked to the upper-bound check.
TargetedPatchingOptOutAttribute is not present on the other Is...(char) methods, for example IsLetter, IsNumber, etc.
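The first point can be checked empirically rather than from decompiled code, which may not show casts that exist in the original source. The following is only a minimal sketch: `BoundsProbe` and `Probe` are hypothetical names made up for this illustration, and the output for the negative index is deliberately left for the reader to observe.

```csharp
using System;

static class BoundsProbe
{
    // Hypothetical helper: calls Char.IsWhiteSpace(string, int) and reports
    // the name of the exception type it throws, or "no exception".
    public static string Probe(string s, int index)
    {
        try
        {
            char.IsWhiteSpace(s, index);
            return "no exception";
        }
        catch (Exception e)
        {
            return e.GetType().Name;
        }
    }

    public static void Main()
    {
        Console.WriteLine(Probe("string ", 6));   // last valid index
        Console.WriteLine(Probe("string ", 7));   // one past the end
        Console.WriteLine(Probe("string ", -1));  // negative index: compare with the line above
    }
}
```

If the last two lines print the same exception type, the lower bound is being handled by the method itself rather than by the string indexer.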

Esailija · Accepted Answer

Because not every character fits in a C# char. For instance, "𠀀" takes 2 C# chars, and you couldn't get any information about that character with just a char overload. With a string and an index, the methods can check whether the char at index i is a high surrogate, read the low surrogate at the next index, combine the two according to the UTF-16 decoding algorithm, and retrieve information about the code point U+20000.
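A small sketch of the difference, using IsLetter because U+20000 is a CJK ideograph. The class name `SurrogateDemo` and the constant `Ideograph` are made up for this example; the string is written with explicit escapes so the two chars are visible.

```csharp
using System;

static class SurrogateDemo
{
    // "𠀀" (U+20000): a high surrogate followed by a low surrogate.
    public const string Ideograph = "\uD840\uDC00";

    public static void Main()
    {
        string s = Ideograph;

        Console.WriteLine(s.Length);                    // 2: one code point, two chars
        Console.WriteLine(char.IsHighSurrogate(s[0]));  // True
        Console.WriteLine(char.IsLowSurrogate(s[1]));   // True

        // The char overload sees only half of the pair, which is not a letter on its own.
        Console.WriteLine(char.IsLetter(s[0]));         // False

        // The (string, int) overload can also read s[1] and classify the whole code point.
        Console.WriteLine(char.IsLetter(s, 0));         // True
    }
}
```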

This is how UTF-16 can encode over 1 million different code points: it is a variable-width encoding. A character takes either 2 or 4 bytes to encode, that is, 1 or 2 C# chars.
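The combining step can be sketched as plain arithmetic. `Utf16Demo` and `Combine` are hypothetical names for this illustration, not framework code; the built-in equivalent is char.ConvertToUtf32, which is used here to cross-check the result.

```csharp
using System;

static class Utf16Demo
{
    // Hypothetical helper: combines a surrogate pair into a single code point.
    // Each surrogate carries 10 bits; the result is offset by 0x10000.
    public static int Combine(char high, char low)
    {
        if (!char.IsHighSurrogate(high) || !char.IsLowSurrogate(low))
        {
            throw new ArgumentException("Not a valid surrogate pair.");
        }
        return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00);
    }

    public static void Main()
    {
        string s = "\uD840\uDC00"; // U+20000

        int manual = Combine(s[0], s[1]);
        int builtIn = char.ConvertToUtf32(s, 0);

        Console.WriteLine("{0:X} {1:X}", manual, builtIn); // 20000 20000
    }
}
```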

Tags:

c#

weston
