
How to compare 'μ' and 'µ' in C# [duplicate]

I ran into a surprising issue.

I load a text file in my application, and some of my logic compares values that contain µ.

I realized that even when the two texts look the same, the comparison returns false.

    Console.WriteLine("μ".Equals("µ")); // returns false
    Console.WriteLine("µ".Equals("µ")); // returns true

In the second line, the character µ is copy-pasted.

However, these might not be the only characters that are like this.

Is there any way in C# to compare the characters which look the same but are actually different?

D J asked Dec 19 '13 06:12


1 Answer

They really are different characters, even though they look the same. The first is the actual Greek letter mu and has char code 956 (0x3BC); the second is the micro sign and has char code 181 (0xB5).

References:

  • Unicode Character 'GREEK SMALL LETTER MU' (U+03BC)
  • Unicode Character 'MICRO SIGN' (U+00B5)
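To check which character a string actually contains, it can help to print its code point. The following is a minimal sketch; the class name `CodePoints` and the helper `Of` are hypothetical names introduced only for this illustration.

```csharp
using System;

public static class CodePoints
{
    // Hypothetical helper: returns the Unicode code point of the first character.
    public static int Of(string s) => char.ConvertToUtf32(s, 0);

    public static void Main()
    {
        string greekMu = "\u03BC";   // GREEK SMALL LETTER MU
        string microSign = "\u00B5"; // MICRO SIGN

        Console.WriteLine(Of(greekMu));          // 956
        Console.WriteLine(Of(microSign));        // 181
        Console.WriteLine(greekMu == microSign); // False
    }
}
```

Writing the characters as `\u` escapes keeps the source unambiguous, since the two glyphs are hard to tell apart in most editors.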

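So if you want to compare them and need them to be treated as equal, you have to handle it manually, for example by replacing one character with the other before the comparison. Or use the following code, which relies on compatibility normalization (FormKC) to map the micro sign to the Greek letter: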

    using System;
    using System.Globalization;
    using System.Text;

    class Program
    {
        static void Main()
        {
            var s1 = "μ"; // U+03BC, Greek small letter mu
            var s2 = "µ"; // U+00B5, micro sign

            Console.WriteLine(s1.Equals(s2));  // false
            Console.WriteLine(RemoveDiacritics(s1).Equals(RemoveDiacritics(s2))); // true
        }

        static string RemoveDiacritics(string text)
        {
            // FormKC folds compatibility characters, so the micro sign becomes mu.
            var normalizedString = text.Normalize(NormalizationForm.FormKC);
            var stringBuilder = new StringBuilder();

            foreach (var c in normalizedString)
            {
                // Skip any combining marks that are left after normalization.
                var unicodeCategory = CharUnicodeInfo.GetUnicodeCategory(c);
                if (unicodeCategory != UnicodeCategory.NonSpacingMark)
                {
                    stringBuilder.Append(c);
                }
            }

            return stringBuilder.ToString().Normalize(NormalizationForm.FormC);
        }
    }

And the Demo
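If stripping combining marks is not needed, the two options mentioned in this answer (manual replacement and FormKC normalization) can be sketched more compactly. This is an illustration under assumptions, not a general solution: `LookalikeCompare`, `EqualsNfkc` and `MicroToMu` are hypothetical names, and the sample strings are made up.

```csharp
using System;
using System.Text;

public static class LookalikeCompare
{
    // Hypothetical helper: ordinal comparison after compatibility normalization (NFKC).
    public static bool EqualsNfkc(string a, string b) =>
        string.Equals(
            a.Normalize(NormalizationForm.FormKC),
            b.Normalize(NormalizationForm.FormKC),
            StringComparison.Ordinal);

    // Hypothetical helper: manual replacement of the micro sign with Greek mu.
    public static string MicroToMu(string text) => text.Replace('\u00B5', '\u03BC');

    public static void Main()
    {
        string withMicro = "5 \u00B5m"; // contains MICRO SIGN
        string withMu = "5 \u03BCm";    // contains GREEK SMALL LETTER MU

        Console.WriteLine(withMicro == withMu);           // False
        Console.WriteLine(EqualsNfkc(withMicro, withMu)); // True
        Console.WriteLine(MicroToMu(withMicro) == withMu); // True
    }
}
```

Note that FormKC only unifies characters that Unicode defines as compatibility equivalents. Lookalikes from different scripts, such as Latin "a" (U+0061) and Cyrillic "а" (U+0430), stay distinct after normalization and would still need a manual mapping.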

Tony answered Sep 18 '22 00:09