What are the best practices for handling Unicode strings in C#? [closed]

Tags: c#, unicode

Can somebody please point out the important aspects I should be aware of when handling Unicode strings in C#?

Vijesh VP asked Sep 27 '08 20:09


1 Answer

Keep in mind that C# strings are sequences of Char, that is, UTF-16 code units. They are not sequences of Unicode code points. Code points outside the Basic Multilingual Plane (above U+FFFF) require two Chars, a surrogate pair, and you should not split a string between the two Chars of a pair.
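To make the code-unit versus code-point distinction concrete, here is a minimal sketch. The class and method names (`CodePointDemo`, `CodePoints`, `SafeSplitIndex`) are made up for illustration; only `char.IsSurrogatePair`, `char.ConvertToUtf32`, `char.IsHighSurrogate` and `char.IsLowSurrogate` come from the framework.

```csharp
using System;
using System.Collections.Generic;

public static class CodePointDemo
{
    // Returns the Unicode code points of s, joining each surrogate pair
    // into a single value. Unpaired surrogates are passed through as-is.
    public static List<int> CodePoints(string s)
    {
        var result = new List<int>();
        for (int i = 0; i < s.Length; i++)
        {
            if (char.IsSurrogatePair(s, i))
            {
                result.Add(char.ConvertToUtf32(s[i], s[i + 1]));
                i++; // skip the low surrogate we just consumed
            }
            else
            {
                result.Add(s[i]);
            }
        }
        return result;
    }

    // Moves a proposed split index back by one Char if it would fall
    // between the two halves of a surrogate pair.
    public static int SafeSplitIndex(string s, int index)
    {
        if (index > 0 && index < s.Length
            && char.IsHighSurrogate(s[index - 1])
            && char.IsLowSurrogate(s[index]))
        {
            return index - 1;
        }
        return index;
    }
}
```

For the string `"a\U0001F600b"`, `Length` is 4 but `CodePoints` returns 3 values (0x61, 0x1F600, 0x62), and `SafeSplitIndex(s, 2)` returns 1 so that the pair stays together.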

In addition, Unicode code points may combine to form a single language 'character', for instance a 'u' Char followed by a combining umlaut (diaeresis) Char. So you can't split strings between arbitrary code points either.
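For combining sequences, one option is to walk the string by text element using `System.Globalization.StringInfo`. The sketch below assumes that is acceptable for your case; the wrapper names (`TextElementDemo`, `TextElements`) are hypothetical, and exactly which sequences count as one text element can differ between .NET versions.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class TextElementDemo
{
    // Splits s into text elements (a base character together with any
    // combining marks that follow it), so a split between two returned
    // items never separates a base character from its marks.
    public static List<string> TextElements(string s)
    {
        var result = new List<string>();
        TextElementEnumerator e = StringInfo.GetTextElementEnumerator(s);
        while (e.MoveNext())
        {
            result.Add(e.GetTextElement());
        }
        return result;
    }
}
```

For `"u\u0308ber"` (a 'u' followed by U+0308 COMBINING DIAERESIS, then "ber"), `Length` is 5 but `TextElements` returns 4 items, the first of which is two Chars long.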

Basically, it's a mess of issues, and any given issue may in practice only affect languages you don't know.

Aaron answered Oct 21 '22 14:10