In XML, Unicode characters are represented as follows, e.g.:
\ue349
What if I want to write a string consisting of two characters with the code points E343 and E312?
How can this be represented in XML?
Characters are denoted using the notation used in the Unicode Standard, that is, an optional U+ followed by their hexadecimal number, using at least 4 digits, such as "U+1234" or "U+10FFFD". In XML or HTML this could be expressed as "&#x1234;" or "&#x10FFFD;".
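The mapping between the two notations is mechanical, so it can be sketched in a few lines of Python. This is only an illustration; the helper names u_plus and char_ref are made up for this example and do not come from any library.

```python
def u_plus(code_point: int) -> str:
    """Format a code point in U+ notation, padded to at least 4 hex digits."""
    return f"U+{code_point:04X}"


def char_ref(code_point: int) -> str:
    """Format a code point as a hexadecimal XML/HTML character reference."""
    return f"&#x{code_point:X};"


for cp in (0x1234, 0x10FFFD):
    print(u_plus(cp), "->", char_ref(cp))
```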
Unicode code points in the following ranges are always valid in XML 1.1 documents: U+0001–U+D7FF and U+E000–U+FFFD. This includes most C0 and C1 control characters, but excludes the surrogates and some (not all) non-characters in the BMP: U+FFFE and U+FFFF are forbidden.
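A range check against the two ranges named above might look like the following sketch. It covers only those two ranges and is not a full XML 1.1 validity test; the names LISTED_RANGES and in_listed_ranges are hypothetical.

```python
# Only the two ranges quoted in the text above; this is not a complete
# statement of which characters XML 1.1 allows.
LISTED_RANGES = [(0x0001, 0xD7FF), (0xE000, 0xFFFD)]


def in_listed_ranges(code_point: int) -> bool:
    """Return True if the code point falls in one of the listed ranges."""
    return any(low <= code_point <= high for low, high in LISTED_RANGES)


for cp in (0xE343, 0xE312, 0xD800, 0xFFFE):
    print(f"U+{cp:04X}", in_listed_ranges(cp))
```

The Private Use code points U+E343 and U+E312 fall in the second range, while the surrogate U+D800 and U+FFFE fall outside both.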
Unicode is a character encoding standard that covers characters from almost all languages. Every Unicode character is identified by a unique integer code point between 0 and 0x10FFFF. A Unicode string is a sequence of zero or more code points.
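Python 3 strings happen to follow this model, so the idea can be illustrated with the built-ins ord and chr. This is a small sketch; the variable names are arbitrary.

```python
import sys

# A two-character string written with Python's \u escapes.
s = "\ue343\ue312"

# ord() maps each character to its integer code point.
code_points = [ord(ch) for ch in s]
print([f"U+{cp:04X}" for cp in code_points])

# chr() maps code points back to characters, rebuilding the string.
rebuilt = "".join(chr(cp) for cp in code_points)
print(rebuilt == s)

# The largest code point Python accepts matches the Unicode upper bound.
print(hex(sys.maxunicode))
```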
XML does not use the \ue349 notation. Character references, starting with &#, may be used, but they are mostly not needed. XML is usually used with the UTF-8 character encoding, so that each character can be written as such. (When generating XML in a program, you might well use a notation like \ue349 if supported by the programming language.)
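As a rough illustration of the parenthetical remark, the sketch below generates XML from Python with the standard xml.etree.ElementTree module. The element name s is an arbitrary choice for this example. With UTF-8 output the character is written directly; with an ASCII output encoding the serializer falls back to a character reference.

```python
import xml.etree.ElementTree as ET

# The \ue349 escape belongs to the Python source code, not to XML.
elem = ET.Element("s")
elem.text = "\ue349"

# UTF-8 output: the character itself appears as encoded bytes.
utf8_bytes = ET.tostring(elem, encoding="utf-8")

# ASCII output: the character cannot be encoded directly, so the
# serializer emits a numeric character reference in its place.
ascii_bytes = ET.tostring(elem, encoding="us-ascii")

print(utf8_bytes)
print(ascii_bytes)
```

Either form parses back to the same one-character string.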
In Unicode, the numbers E343 and E312 refer to Private Use code points, to which no character is assigned by the standard. They may be used by private agreements as desired, but you should not expect any software or any person to understand them, except by such agreements. With this in mind, the code points U+E343 U+E312 (and hence the characters they may denote by some agreement) can be written as &#xE343;&#xE312;.
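To check how a parser reads hexadecimal character references for these two code points, here is a minimal sketch using Python's standard xml.etree.ElementTree. The wrapping element s is hypothetical and chosen only for this example.

```python
import xml.etree.ElementTree as ET

# XML source containing two hexadecimal character references.
doc = "<s>&#xE343;&#xE312;</s>"

# The parser resolves each reference to the character it denotes.
text = ET.fromstring(doc).text

print(len(text))
print([f"U+{ord(ch):04X}" for ch in text])
```

The parsed text is a two-character string equal to the Python literal "\ue343\ue312".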