Is it possible to convert language-specific characters to Latin characters in UTF-8?

Question

I am wondering whether there is an existing mapping or algorithm for converting national characters to their equivalent basic Latin characters in UTF-8 encoded text.

For example (in Polish):

Ą -> A

Ó -> O

ż -> z

ź -> z ...

A phrase like: 'zażółć gęślą jaźń'

converts to: 'zazolc gesla jazn'

Currently I am using a conversion array for Polish, but I am looking for a universal solution that handles all Latin-based languages.

Thanks

carlo.borreo · Accepted Answer

Check this:

http://sourceforge.net/projects/iconvnet/

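In general, search for iconv, the character-set conversion library. Its transliteration mode (the //TRANSLIT suffix on the target encoding) approximates characters that the target encoding cannot represent.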

tomekole · Answer

To make the answer complete: searching for 'Unicode decomposition + C#' led me to this CodeProject article (codeproject.com/KB/cs/UnicodeNormalization.aspx?display=Print), which offers a ready-to-use solution. The ability to name what you are looking for can't be overestimated ;) Thanks for all the answers.
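
For readers who want to see the idea without following the link, here is a minimal sketch of the Unicode decomposition approach in C#. It is not the code from the CodeProject article. The class name `LatinFolding`, the method `RemoveDiacritics`, and the `Fallback` table are hypothetical names chosen for illustration. The sketch normalizes to form D, which splits a letter such as 'ą' into 'a' plus a combining ogonek, and then drops the combining marks. Letters such as 'ł' or 'ø' have no canonical decomposition, so they still need a small lookup table; the one below is deliberately incomplete.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;

public static class LatinFolding
{
    // Hypothetical, incomplete table for letters that Unicode does not
    // decompose into "base letter + combining mark" (stroke letters etc.).
    private static readonly Dictionary<char, string> Fallback =
        new Dictionary<char, string>
        {
            { 'ł', "l" }, { 'Ł', "L" },
            { 'ø', "o" }, { 'Ø', "O" },
            { 'đ', "d" }, { 'Đ', "D" },
            { 'ß', "ss" }
        };

    public static string RemoveDiacritics(string text)
    {
        // Form D (canonical decomposition): 'ż' becomes 'z' + U+0307, etc.
        string decomposed = text.Normalize(NormalizationForm.FormD);
        var sb = new StringBuilder(decomposed.Length);

        foreach (char c in decomposed)
        {
            // Skip the combining marks produced by the decomposition.
            if (CharUnicodeInfo.GetUnicodeCategory(c) == UnicodeCategory.NonSpacingMark)
                continue;

            string replacement;
            if (Fallback.TryGetValue(c, out replacement))
                sb.Append(replacement);   // letter handled by the table
            else
                sb.Append(c);             // plain character, keep as-is
        }

        // Recompose whatever is left so the result is in the usual form C.
        return sb.ToString().Normalize(NormalizationForm.FormC);
    }
}
```

With this sketch, `LatinFolding.RemoveDiacritics("zażółć gęślą jaźń")` returns `"zazolc gesla jazn"`. Note that the 'ł' is handled by the table, not by the decomposition, so a truly universal solution still needs such a table extended for the languages you care about.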

Tags: unicode, c#-4.0