Ignoring accented letters in string comparison

EDIT 2012-01-20: Oh boy! The solution was so much simpler and has been in the framework nearly forever. As pointed out by knightpfhor:

string.Compare(s1, s2, CultureInfo.CurrentCulture, CompareOptions.IgnoreNonSpace);

Here's a function that strips diacritics from a string:

static string RemoveDiacritics(string text)
{
  // Decompose each accented char into base char + combining mark(s)
  string formD = text.Normalize(NormalizationForm.FormD);
  StringBuilder sb = new StringBuilder();

  foreach (char ch in formD)
  {
    UnicodeCategory uc = CharUnicodeInfo.GetUnicodeCategory(ch);
    if (uc != UnicodeCategory.NonSpacingMark)
    {
      sb.Append(ch); // keep everything except the combining marks
    }
  }

  return sb.ToString().Normalize(NormalizationForm.FormC);
}

More details on MichKap's blog (RIP...).

The principle is that normalizing to FormD turns 'é' into 2 successive chars: 'e' followed by a combining acute accent. The function then iterates through the chars and skips the diacritics.

"héllo" becomes "he<acute>llo", which in turn becomes "hello".
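
To see the decomposition step on its own, here is a minimal sketch that lists what FormD normalization produces for a string. The class and method names (DecompositionDemo, DescribeFormD) are made up for illustration; only Normalize and CharUnicodeInfo.GetUnicodeCategory come from the framework.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Text;

static class DecompositionDemo
{
    // Hypothetical helper: returns one "U+XXXX Category" entry per char
    // of the FormD-normalized (decomposed) text.
    public static string[] DescribeFormD(string text)
    {
        return text.Normalize(NormalizationForm.FormD)
            .Select(ch => string.Format("U+{0:X4} {1}",
                (int)ch, CharUnicodeInfo.GetUnicodeCategory(ch)))
            .ToArray();
    }
}
```

Calling DecompositionDemo.DescribeFormD("\u00E9") (that is, "é") should give two entries, "U+0065 LowercaseLetter" and "U+0301 NonSpacingMark"; the second one is the char that RemoveDiacritics skips.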


Note: Here's a more compact .NET4+ friendly version of the same function:

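static string RemoveDiacritics(string text)
{
  return string.Concat(text.Normalize(NormalizationForm.FormD)
      .Where(ch => CharUnicodeInfo.GetUnicodeCategory(ch) != UnicodeCategory.NonSpacingMark)
    ).Normalize(NormalizationForm.FormC);
}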

If you don't need to convert the string and you just want to check for equality, you can use:

string s1 = "hello";
string s2 = "héllo";

if (String.Compare(s1, s2, CultureInfo.CurrentCulture, CompareOptions.IgnoreNonSpace) == 0)
{
    // both strings are equal
}

Or, if you want the comparison to be case insensitive as well:

string s1 = "HEllO";
string s2 = "héLLo";

if (String.Compare(s1, s2, CultureInfo.CurrentCulture, CompareOptions.IgnoreNonSpace | CompareOptions.IgnoreCase) == 0)
{
    // both strings are equal
}

I had to do something similar but with a StartsWith method. Here is a simple solution derived from @Serge - appTranslator.

Here is an extension method:

    public static bool StartsWith(this string str, string value, CultureInfo culture, CompareOptions options)
    {
        if (str.Length >= value.Length)
            return string.Compare(str.Substring(0, value.Length), value, culture, options) == 0;
        else
            return false;
    }

And for one-liner freaks ;)

    public static bool StartsWith(this string str, string value, CultureInfo culture, CompareOptions options)
    {
        return str.Length >= value.Length && string.Compare(str.Substring(0, value.Length), value, culture, options) == 0;
    }

An accent-insensitive and case-insensitive StartsWith can be called like this:

value.ToString().StartsWith(str, CultureInfo.InvariantCulture, CompareOptions.IgnoreNonSpace | CompareOptions.IgnoreCase)
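
One thing to watch with the Substring-based versions: they assume the prefix and the matching part of the string have the same number of chars, which may not hold if one side stores accents in decomposed form ('e' + combining acute). The framework's CompareInfo.IsPrefix does a culture-aware prefix test without slicing the string. A minimal sketch, where the class and method names (PrefixExtensions, StartsWithIgnoring) are hypothetical:

```csharp
using System.Globalization;

static class PrefixExtensions
{
    // Hypothetical wrapper around the framework's CompareInfo.IsPrefix,
    // which applies the compare options without taking a fixed-length substring.
    public static bool StartsWithIgnoring(this string str, string value,
        CultureInfo culture, CompareOptions options)
    {
        return culture.CompareInfo.IsPrefix(str, value, options);
    }
}
```

It is called the same way as the extension method above, e.g. "Héllo world".StartsWithIgnoring("hello", CultureInfo.InvariantCulture, CompareOptions.IgnoreNonSpace | CompareOptions.IgnoreCase).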

The following method CompareIgnoreAccents(...) works on your example data. Here is the article where I got my background information: http://www.codeproject.com/KB/cs/EncodingAccents.aspx

private static bool CompareIgnoreAccents(string s1, string s2)
{
    return string.Compare(
        RemoveAccents(s1), RemoveAccents(s2), StringComparison.InvariantCultureIgnoreCase) == 0;
}

private static string RemoveAccents(string s)
{
    // ISO-8859-8 has no accented Latin letters, so this appears to rely on the
    // encoder's best-fit mapping to replace them with their base letters.
    Encoding destEncoding = Encoding.GetEncoding("iso-8859-8");

    return destEncoding.GetString(
        Encoding.Convert(Encoding.UTF8, destEncoding, Encoding.UTF8.GetBytes(s)));
}

I think an extension method would be better:

public static string RemoveAccents(this string s)
{
    Encoding destEncoding = Encoding.GetEncoding("iso-8859-8");

    return destEncoding.GetString(
        Encoding.Convert(Encoding.UTF8, destEncoding, Encoding.UTF8.GetBytes(s)));
}

Then the use would be this:

if (string.Compare(s1.RemoveAccents(), s2.RemoveAccents(), true) == 0) { /* strings match, ignoring accents and case */ }

A simpler way to remove accents:

    Dim source As String = "áéíóúç"
    Dim result As String

    Dim bytes As Byte() = Encoding.GetEncoding("Cyrillic").GetBytes(source)
    result = Encoding.ASCII.GetString(bytes)