I have the following method in my C# code:
/// <summary>
/// Removes the first (leftmost) occurrence of a <paramref name="substring"/> from a <paramref name="string"/>.
/// </summary>
/// <param name="string">The string to remove the <paramref name="substring"/> from. Cannot be <c>null</c>.</param>
/// <param name="substring">The substring to look for and remove from the <paramref name="string"/>. Cannot be <c>null</c>.</param>
/// <returns>
/// The rest of the <paramref name="string"/>, after the first (leftmost) occurrence of the <paramref name="substring"/> in it (if any) has been removed.
/// </returns>
/// <remarks>
/// <list type="bullet">
/// <item>If the <paramref name="substring"/> does not occur within the <paramref name="string"/>, the <paramref name="string"/> is returned intact.</item>
/// <item>If the <paramref name="substring"/> has exactly one occurrence within the <paramref name="string"/>, that occurrence is removed, and the rest of the <paramref name="string"/> is returned.</item>
/// <item>If the <paramref name="substring"/> has several occurrences within the <paramref name="string"/>, the first (leftmost) occurrence is removed, and the rest of the <paramref name="string"/> is returned.</item>
/// </list>
/// </remarks>
/// <exception cref="ArgumentNullException">
/// The <paramref name="string"/> is <c>null</c>. -or- The <paramref name="substring"/> is <c>null</c>.
/// </exception>
public static string RemoveSubstring(string @string, string substring)
{
    if (@string == null)
        throw new ArgumentNullException("string");
    if (substring == null)
        throw new ArgumentNullException("substring");
    var index = @string.IndexOf(substring);
    return index == -1
        ? @string
        : @string.Substring(0, index) + @string.Substring(index + substring.Length);
}
The implementation looks simple and obvious, and it has excellent unit test coverage. It has never produced unexpected results on my machine, on the build servers, on any other machine I have access to, or in most production environments.
However, one remote customer occasionally reports an application crash in this method, with the following stack trace:
System.ArgumentOutOfRangeException: startIndex cannot be larger than length of string.
Parameter name: startIndex
at System.String.InternalSubStringWithChecks(Int32 startIndex, Int32 length, Boolean fAlwaysCopy)
at System.String.Substring(Int32 startIndex)
at MyNamespace.StringUtils.RemoveSubstring(String string, String substring)
at ...
Unfortunately, I do not have remote access to this production environment, to its data, or to any additional information. For various reasons, I am currently unable to deploy a logging system or crash dump collection there.
Looking at the code, and trying different combinations of arguments, I cannot imagine how this exception could possibly occur.
Could you please help me with some ideas?
RemoveSubstring("A", "A\uFFFD"); // throws ArgumentOutOfRangeException
RemoveSubstring("A", "A\u0640"); // throws ArgumentOutOfRangeException
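Many string-manipulation functions in .NET, including IndexOf, are culture-specific by default (usually there are overloads that accept StringComparison.Ordinal or StringComparer.Ordinal to switch to bitwise comparisons). In a culture-sensitive comparison, some characters, such as U+FFFD and U+0640 in the calls above, can be treated as ignorable. IndexOf can then report a match at index 0 even though the substring is longer than the string, so index + substring.Length exceeds the length of the string and Substring throws. Personally, I am not very happy with the default behavior that was chosen, but it is too late to change it; the most one can do is set explicit development guidelines and FxCop rules.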
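If a bitwise match is what the method is meant to perform, one possible fix is to pass StringComparison.Ordinal to IndexOf. The following is only a minimal sketch under that assumption; the class name StringUtilsSketch, the method name RemoveSubstringOrdinal, and the parameter name source are hypothetical and do not come from the original code.

```csharp
using System;

public static class StringUtilsSketch
{
    // Hypothetical ordinal variant of RemoveSubstring.
    // With StringComparison.Ordinal, a match found at `index` always spans
    // exactly substring.Length chars, so index + substring.Length can never
    // exceed source.Length.
    public static string RemoveSubstringOrdinal(string source, string substring)
    {
        if (source == null)
            throw new ArgumentNullException(nameof(source));
        if (substring == null)
            throw new ArgumentNullException(nameof(substring));

        // Ordinal search: compares UTF-16 code units, ignores culture rules.
        int index = source.IndexOf(substring, StringComparison.Ordinal);

        return index < 0
            ? source
            : source.Substring(0, index) + source.Substring(index + substring.Length);
    }
}
```

With this variant, RemoveSubstringOrdinal("A", "A\u0640") finds no match and returns "A" unchanged instead of throwing, because an ordinal search cannot match a two-character substring inside a one-character string.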
But sometimes culture-specific operations are exactly what you need. Unfortunately, their semantics can be tricky and counter-intuitive, can violate some normally assumed invariants, and have many corner cases to take care of. Developers in charge of implementing culture-sensitive logic in an application should be well qualified in this area and always understand exactly what they are doing. I would recommend setting review and testing standards for this area higher than usual.