
Unicode string in XML

Tags:

xml

unicode

In XML, Unicode characters are represented as follows, e.g.:

\ue349

What if I want to write a string consisting of two characters with the code points E343 and E312?

How can this be represented in XML?

Noha Nhe asked Jul 21 '12 12:07


People also ask

How do I use Unicode characters in XML?

Characters are denoted using the notation used in the Unicode Standard, that is, an optional U+ followed by their hexadecimal number, using at least 4 digits, such as "U+1234" or "U+10FFFD". In XML or HTML this could be expressed as "&#x1234;" or "&#x10FFFD;".

Can XML contain Unicode?

Unicode code points in the following ranges are always valid in XML 1.1 documents: U+0001–U+D7FF, U+E000–U+FFFD, and U+10000–U+10FFFF. This includes most C0 and C1 control characters, but excludes the surrogates (U+D800–U+DFFF) and the noncharacters U+FFFE and U+FFFF.

What is Unicode string type?

Unicode is a standard encoding system that is used to represent characters from almost all languages. Every Unicode character is encoded using a unique integer code point between 0 and 0x10FFFF. A Unicode string is a sequence of zero or more code points.


1 Answer

XML does not use \ue349 notation. Character references, starting with &#, may be used, but they are mostly not needed. XML is usually used with UTF-8 character encoding, so that each character can be written as such. (When generating XML in a program, you might well use a notation like \ue349 if supported by the programming language.)
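As a rough illustration of the last point, here is a minimal Python sketch that uses the language's own \ue349 string escape and lets the standard library's xml.etree.ElementTree serialize it. The element name "item" is made up for the example. With UTF-8 output the character is written as such; with ASCII output the serializer falls back to a numeric character reference.

```python
import xml.etree.ElementTree as ET

# "\ue349" is a Python string escape, not XML syntax.
# It produces a one-character string containing U+E349.
text = "\ue349"

elem = ET.Element("item")  # "item" is a hypothetical element name
elem.text = text

# UTF-8 output: the character is written directly as its UTF-8 bytes.
utf8_bytes = ET.tostring(elem, encoding="utf-8")

# ASCII output: the character cannot be encoded, so the serializer
# emits a numeric character reference (starting with &#) instead.
ascii_bytes = ET.tostring(elem, encoding="us-ascii")

# Both forms parse back to the same one-character string.
from_utf8 = ET.fromstring(utf8_bytes).text
from_ascii = ET.fromstring(ascii_bytes).text
```

Either serialization is equivalent to an XML parser; the choice only affects how the bytes look in the file.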

In Unicode, the numbers E343 and E312 refer to Private Use code points, to which no character is assigned by the standard. They may be used by private agreements as desired, but you should not expect any software or any person to understand them, except by such agreements. With this in mind, the code points U+E343 U+E312 (and hence the characters they may denote by some agreement) can be written as &#xE343;&#xE312;.
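To check how a parser treats hexadecimal character references for these two code points, a small Python sketch along the following lines may help. The wrapper element "s" is an assumption for the example, and the category lookup relies on the standard unicodedata module, where "Co" denotes Private Use.

```python
import unicodedata
import xml.etree.ElementTree as ET

# Two hexadecimal character references inside a hypothetical <s> element.
doc = "<s>&#xE343;&#xE312;</s>"

# The parser resolves each reference to a single character.
text = ET.fromstring(doc).text

# List the code points in U+XXXX notation.
code_points = [f"U+{ord(ch):04X}" for ch in text]

# Look up the Unicode general category of each character.
categories = [unicodedata.category(ch) for ch in text]
```

The parsed string has exactly two characters, and what they look like on screen depends entirely on the font or private agreement in use.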

Jukka K. Korpela answered Sep 28 '22 08:09