
Unicode characters string

I have the following String of characters.

string s = "\\u0625\\u0647\\u0644"; 

When I print the above sequence, I get:

\u0625\u0647\u0644

How can I get the real printable Unicode characters instead of this \uxxxx representation?

Marc Andreson asked Jul 28 '12 11:07


2 Answers

If you really don't control the string, then you need to replace those escape sequences with their values:

s = Regex.Replace(s, @"\\u([0-9A-Fa-f]{4})", m => ((char)Convert.ToInt32(m.Groups[1].Value, 16)).ToString());

and hope that you don't have \\ escapes in there too.
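
For reference, here is a minimal, self-contained sketch of the replacement approach above. The class name `UnicodeEscapes` and method name `Decode` are made up for illustration; only `Regex.Replace` and `Convert.ToInt32` come from the framework. It assumes every `\uXXXX` sequence in the input is a real escape, so an escaped backslash followed by `u0041` would still be converted.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, named here only for illustration.
public static class UnicodeEscapes
{
    // Matches a literal backslash, the letter 'u', and exactly four hex digits.
    private static readonly Regex EscapePattern =
        new Regex(@"\\u([0-9A-Fa-f]{4})");

    public static string Decode(string input)
    {
        // Parse the four hex digits as a UTF-16 code unit and emit that char.
        return EscapePattern.Replace(
            input,
            m => ((char)Convert.ToInt32(m.Groups[1].Value, 16)).ToString());
    }
}
```

With the string from the question, `UnicodeEscapes.Decode("\\u0625\\u0647\\u0644")` should return the three Arabic letters U+0625, U+0647, U+0644. Text without any `\uXXXX` sequence is returned unchanged.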

Joey answered Sep 23 '22 04:09


Asker posted this as an answer to their question:

I have found the answer:

s = System.Text.RegularExpressions.Regex.Unescape(s); 
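
A short sketch of how that call behaves, in case it helps. The wrapper class `UnescapeDemo` and its method are hypothetical names; `Regex.Unescape` is the only framework call. Note that it is not limited to `\uXXXX`: it also converts other regex-style escapes such as `\t` and `\\`, and as far as I know it throws an `ArgumentException` on sequences it does not recognise, so it fits best when the input is known to contain only well-formed escapes.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical wrapper, named here only for illustration.
public static class UnescapeDemo
{
    public static string Decode(string input)
    {
        // Converts \uXXXX as well as other escapes such as \t, \n and \\.
        return Regex.Unescape(input);
    }
}
```

For the question's input, `UnescapeDemo.Decode("\\u0625\\u0647\\u0644")` should give the same three characters as the regex replacement in the other answer.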
Ruzihm answered Sep 22 '22 04:09