
How to replace special characters with their equivalent (such as "á" with "a") in C#?

I need to get the Portuguese text content out of an Excel file and create an XML file that will be used by an application that doesn't support characters such as "ç", "á", "é", and others. I can't simply remove those characters; I need to replace them with their unaccented equivalents ("c", "a", "e", for example).

I assume there's a better way to do it than checking each character individually and replacing it with its counterpart. Any suggestions on how to do it?

jehuty asked Mar 06 '10 19:03



2 Answers

You could try something like this:

// Requires: using System.Globalization; using System.Linq; using System.Text;
var decomposed = "áéö".Normalize(NormalizationForm.FormD);
var filtered = decomposed.Where(c => char.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark);
var newString = new string(filtered.ToArray());

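Normalizing to form D splits each accented letter into its base letter followed by one or more combining marks. The filter then drops those marks, and the remaining characters are joined into a new string. Combining diacritics belong to the NonSpacingMark Unicode category.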
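
The same steps can be wrapped in a reusable method. The following is a minimal sketch, not a definitive implementation: the class name `AccentStripper` and the method name `RemoveDiacritics` are hypothetical, and the final recomposition to form C is an extra step that the snippet above does not perform.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Text;

public static class AccentStripper
{
    // Hypothetical helper name; wraps the decompose/filter/rebuild steps.
    public static string RemoveDiacritics(string text)
    {
        if (string.IsNullOrEmpty(text))
        {
            return text;
        }

        // Form D splits "ç" into "c" followed by U+0327 (combining cedilla).
        string decomposed = text.Normalize(NormalizationForm.FormD);

        // Drop the combining marks and keep every other character.
        char[] kept = decomposed
            .Where(c => char.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
            .ToArray();

        // Assumption: recompose to form C so the result is in the usual composed form.
        return new string(kept).Normalize(NormalizationForm.FormC);
    }

    public static void Main()
    {
        Console.WriteLine(RemoveDiacritics("ação, é, português"));
        // prints: acao, e, portugues
    }
}
```

This covers the Portuguese accented letters, which all decompose into a base letter plus combining marks. Letters without a canonical decomposition, such as "ø" or "ß", pass through unchanged.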

Ben Lings answered Oct 14 '22 20:10


// Requires: using System.Collections.Generic; using System.Text;
string text = "coração"; // sample input; substitute the text to convert

// Map each character to replace (key) to its replacement (value).
var replacements = new Dictionary<char, char>();
replacements.Add('ç', 'c');
// ... add the remaining characters here

var replaced = new StringBuilder(text.Length);
foreach (char character in text)
{
    // Use the mapped character when there is one; otherwise keep the original.
    if (replacements.TryGetValue(character, out char replacement))
    {
        replaced.Append(replacement);
    }
    else
    {
        replaced.Append(character);
    }
}

// replaced.ToString() is now your converted text
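
A lookup table like this can also be combined with Unicode normalization, so that the table only has to list characters that normalization cannot handle. The sketch below is one possible arrangement under stated assumptions: the names `AsciiFolder`, `Fold`, and `Fallbacks` are hypothetical, and the table entries are illustrative examples of letters with no canonical decomposition, not a complete list.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;

public static class AsciiFolder
{
    // Illustrative entries only: these letters do not decompose under form D,
    // so they need an explicit replacement.
    private static readonly Dictionary<char, string> Fallbacks = new Dictionary<char, string>
    {
        { 'ø', "o" },
        { 'Ø', "O" },
        { 'ß', "ss" },
        { 'æ', "ae" },
    };

    public static string Fold(string text)
    {
        var result = new StringBuilder(text.Length);

        // Decompose first so accented letters become base letter + combining marks.
        foreach (char c in text.Normalize(NormalizationForm.FormD))
        {
            if (Fallbacks.TryGetValue(c, out string replacement))
            {
                // Explicit mapping for a letter that normalization leaves intact.
                result.Append(replacement);
            }
            else if (char.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
            {
                // Keep everything except the combining marks.
                result.Append(c);
            }
        }

        return result.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Fold("São Paulo, smørbrød"));
        // prints: Sao Paulo, smorbrod
    }
}
```

Using string values rather than char values in the table allows one character to expand into several, as with "ß" becoming "ss".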

Zach Johnson answered Oct 14 '22 20:10