If I have a string that contains combining diacritics, different string functions seem to behave inconsistently. If I use String.IndexOf(), it will combine the diacritic and find the correct character. If I use String.Split(), for some reason it will not combine them and will not find the combined character.
Example code:
string test = "abce\u0308fgh";
Console.WriteLine(test.IndexOf("e"));
Console.WriteLine(test.IndexOf("ë"));
This will work as expected, meaning the e is not found, but the ë is. But for some reason this doesn't behave similarly:
string test = "abcde\u0308fgh";
Console.WriteLine(test.Split('e').Length.ToString());
Console.WriteLine(test.Split('ë').Length.ToString());
For some reason Split() will not combine the diacritic and will split by e, but not by ë.
Is there some reason for this behaviour, and is there a way to either have an IndexOf() function that doesn't combine the diacritic, or preferably a Split() function that does?
Edit: I noticed that the code I had written earlier was different; it used 'e' and not "e":
string test = "abce\u0308fgh";
Console.WriteLine(test.IndexOf('e'));
Console.WriteLine(test.IndexOf('ë'));
This behaves the same way as Split() does, so the difference is not between the methods; it is between passing a character and passing a string.
Actually, when I copy and paste your example code into a blank program, I get exactly the behavior I might expect: neither IndexOf() nor Split() treats the combined character as the passed-in ë search character. That is, the call to IndexOf('ë') returns -1 for me, consistent with how you describe the behavior of Split().
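To make the char-versus-string difference concrete, here is a minimal sketch. It relies on the documented defaults: the char overload of IndexOf() compares ordinally, while the string overload without a StringComparison argument uses the current culture. The class name IndexOfComparisonDemo and the helper OrdinalIndexOf are names I made up for illustration. Passing StringComparison.Ordinal gives you a string-based IndexOf() that does not combine the diacritic.

```csharp
using System;

public static class IndexOfComparisonDemo
{
    // Hypothetical helper: a string search that never combines diacritics,
    // because it compares UTF-16 code units one by one.
    public static int OrdinalIndexOf(string text, string value)
    {
        return text.IndexOf(value, StringComparison.Ordinal);
    }

    public static void Main()
    {
        string test = "abce\u0308fgh"; // 'e' followed by U+0308 COMBINING DIAERESIS

        // char overload: always ordinal
        Console.WriteLine(test.IndexOf('e'));           // 3
        Console.WriteLine(test.IndexOf('\u00EB'));      // -1 (no precomposed ë in the string)

        // string overload forced to ordinal: same results as the char overload
        Console.WriteLine(OrdinalIndexOf(test, "e"));       // 3
        Console.WriteLine(OrdinalIndexOf(test, "\u00EB"));  // -1

        // string overload with culture-sensitive comparison: the result can
        // depend on the runtime and platform, so no expected value is shown.
        Console.WriteLine(test.IndexOf("\u00EB", StringComparison.CurrentCulture));
    }
}
```

The culture-sensitive call is the one that matches what you saw with "e" and "ë"; its exact result may vary between .NET versions and operating systems.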
That said, if you want Split() to treat such two-character representations of a single character as if they had originally been the single-character version, you can just call string.Normalize() before Split(). For example:
Console.WriteLine(test.Normalize().Split('ë').Length);
The Normalize() method has an overload to let you control the exact type of normalization, should that be required (it's not in the example you've provided).
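As a rough sketch of that overload, the following passes NormalizationForm.FormC explicitly, which is also what the parameterless Normalize() uses. The class name NormalizeSplitDemo and the helper CountParts are hypothetical names chosen for this example only.

```csharp
using System;
using System.Text;

public static class NormalizeSplitDemo
{
    // Hypothetical helper: compose the text to Form C, then split on the
    // separator and report how many parts result.
    public static int CountParts(string text, char separator)
    {
        string composed = text.Normalize(NormalizationForm.FormC);
        return composed.Split(separator).Length;
    }

    public static void Main()
    {
        string test = "abcde\u0308fgh"; // 'e' + U+0308, 9 UTF-16 code units

        Console.WriteLine(test.Length);                                      // 9
        Console.WriteLine(test.Normalize(NormalizationForm.FormC).Length);   // 8

        // Without normalization there is no precomposed ë to split on.
        Console.WriteLine(test.Split('\u00EB').Length);   // 1

        // After composing, e + U+0308 becomes U+00EB and the split succeeds.
        Console.WriteLine(CountParts(test, '\u00EB'));    // 2
    }
}
```

If your separator is itself written in decomposed form, NormalizationForm.FormD on both sides would be the analogous choice, though a single char separator can then no longer represent ë.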