When I run the following code in .NET Core 3.1, I get 6 as the return value.
// .NET Core 3.1
string s = "Hello\r\nworld!";
int idx = s.IndexOf("\n");
Console.WriteLine(idx);
Result:
6
But when I run this code in .NET 5.0, I get a different result. Why does this happen?
// .NET 5.0
string s = "Hello\r\nworld!";
int idx = s.IndexOf("\n");
Console.WriteLine(idx);
Result:
-1
The comments and @Ray's answer contain the reason.
And though hacking the .csproj or runtimeconfig.json file may save your day, the real solution is to specify the comparison explicitly:
// this returns the expected result
int idx = s.IndexOf("\n", StringComparison.Ordinal);
For some reason, IndexOf(string) defaults to a current-culture comparison, which can cause surprises even with earlier .NET versions when your app runs in an environment whose regional settings differ from yours.
Using a culture-specific search is actually a very rare scenario (can be valid in a browser, book reader or UI search, for example) and it is much slower than ordinal search.
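As a rough illustration of the difference between the overloads, here is a minimal sketch using the string from the question. It assumes C# 9 top-level statements on .NET 5 or later. No expected value is given for the culture-sensitive call because it depends on the globalization backend in use.

```csharp
using System;

string s = "Hello\r\nworld!";

// Culture-sensitive: follows the current culture's linguistic rules.
// On an ICU-based runtime this may report -1, as in the question.
int linguistic = s.IndexOf("\n");

// Ordinal: compares raw UTF-16 code units, independent of culture and platform.
int ordinal = s.IndexOf("\n", StringComparison.Ordinal);

// The char overload always performs an ordinal search.
int byChar = s.IndexOf('\n');

Console.WriteLine($"IndexOf(string):          {linguistic}");
Console.WriteLine($"IndexOf(string, Ordinal): {ordinal}"); // 6
Console.WriteLine($"IndexOf(char):            {byChar}");  // 6
```

When the search term is a single character, the char overload is the shortest way to get ordinal behavior.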
The same issue applies to StartsWith/EndsWith/ToUpper/ToLower and even the ToString and Parse methods of formattable types (especially floating-point types), as these also use the current culture by default, which can be the source of many gotchas. (Contains(string) is the exception: it has always performed an ordinal comparison.) Recent code analyzers (e.g. FxCop, ReSharper) can warn you if you don't specify a comparison or culture explicitly. It is recommended to set a high severity for these issues in production code.
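A small sketch of what "being explicit" can look like for formatting, parsing, casing and prefix/suffix checks. The sample values are made up for illustration. The current-culture call is shown only for contrast, and its output depends on the machine's regional settings.

```csharp
using System;
using System.Globalization;

double value = 1234.5;

// Current-culture formatting: the decimal separator depends on regional settings.
string local = value.ToString();

// Invariant formatting: always uses '.' as the decimal separator.
string invariant = value.ToString(CultureInfo.InvariantCulture); // "1234.5"

// Parse with the same culture that produced the text.
double roundTrip = double.Parse(invariant, CultureInfo.InvariantCulture);

// Explicit comparison and casing instead of the culture-dependent defaults.
bool isTxt = "file.TXT".EndsWith(".txt", StringComparison.OrdinalIgnoreCase); // true
string upper = "title".ToUpperInvariant(); // "TITLE"

Console.WriteLine($"{local} | {invariant} | {roundTrip} | {isTxt} | {upper}");
```

Invariant or ordinal variants are usually the right choice for data that is stored, transmitted or compared by code. Culture-aware variants are for text shown to, or typed by, a user.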
Your sample code exactly matches the one posted on MSDN, which also explains why this happens and how to revert to the old behavior in these excerpts (emphases mine):
In the past, the .NET globalization APIs used different underlying libraries on different platforms. On Unix, the APIs used International Components for Unicode (ICU), and on Windows, they used National Language Support (NLS). [...] Behavior differences were evident in these areas:
- Cultures and culture data
- String casing
- String sorting and searching
- Sort keys
- String normalization
- Internationalized Domain Names (IDN) support
- Time zone display name on Linux
To revert back to using NLS [as relevant for Windows 10 May 2019 Update and newer which now uses ICU by default], a developer can opt out of the ICU implementation. Applications can enable NLS mode in any of the following ways:
In the project file:
<ItemGroup>
  <RuntimeHostConfigurationOption Include="System.Globalization.UseNls" Value="true" />
</ItemGroup>
In the runtimeconfig.json file:
{
  "runtimeOptions": {
    "configProperties": {
      "System.Globalization.UseNls": true
    }
  }
}
By setting the environment variable DOTNET_SYSTEM_GLOBALIZATION_USENLS to the value true or 1.
The same page also describes loading an app-local copy of ICU instead of switching to NLS. That is configured through the System.Globalization.AppLocalIcu property in runtimeconfig.json, or the DOTNET_SYSTEM_GLOBALIZATION_APPLOCALICU environment variable, set to the value <suffix>:<version> or <version>:
<suffix>: Optional suffix of fewer than 36 characters in length, following the public ICU packaging conventions. When building a custom ICU, you can customize it to produce the lib names and exported symbol names to contain a suffix, for example, libicuucmyapp, where myapp is the suffix.
<version>: A valid ICU version, for example, 67.1. This version is used to load the binaries and to get the exported symbols.
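After changing any of these settings, it can help to confirm which backend the running app actually picked up. The following is a sketch along the lines of the check described in Microsoft's ICU documentation. IsIcuMode is a name chosen here for illustration, and the heuristic (comparing the sort ID with the full sort version) should be treated as an assumption, not a guaranteed contract.

```csharp
using System;
using System.Globalization;

// Heuristic: under ICU, the first four bytes of the sort ID mirror the full sort version.
static bool IsIcuMode()
{
    SortVersion sortVersion = CultureInfo.InvariantCulture.CompareInfo.Version;
    byte[] bytes = sortVersion.SortId.ToByteArray();
    int version = bytes[3] << 24 | bytes[2] << 16 | bytes[1] << 8 | bytes[0];
    return version != 0 && version == sortVersion.FullVersion;
}

Console.WriteLine(IsIcuMode() ? "ICU" : "not ICU (NLS or invariant mode)");
```

Logging this once at startup makes it easier to explain results that differ between machines.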
For more / up-to-date information, please refer to the MSDN link above.
However, I recommend reading György Kőszeg's answer as well, since you only have to worry about these details if you use culture-sensitive string operations in the first place.