
Unicode characters like \u0016 in XML

Tags: xml, unicode

Is there a way to handle Unicode characters like \u0016 in XML? As I understand it, loading a document that contains such characters into XMLDocument throws an "invalid hexadecimal character" error. The other Unicode characters I tried seem to work fine; only the control characters cause this error. Can these characters be removed without actually parsing the XML?

user1081449 asked Dec 13 '11 06:12



1 Answer

Characters are denoted using the notation used in the Unicode Standard, that is, an optional U+ followed by their hexadecimal number, using at least 4 digits, such as U+1234 or U+10FFFD. In XML or HTML this could be expressed as &#x1234; or &#x10FFFD;.

from Unicode Technical Report.

Valid characters in XML:

Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]

from Extensible Markup Language (XML) 1.0 (Fifth Edition)
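
U+0016 falls outside every range in that production (it is below #x20 and is not #x9, #xA or #xD), which is why the parser rejects it. As a rough sketch of removing such characters before parsing, the production can be turned into a regular expression and applied to the raw text. This is written in Python only for illustration; the function names are made up here, and the same character class should carry over to other regex engines. It assumes the document has already been decoded to a string. It also drops numeric character references such as &#x16; that point at a disallowed code point, which may not be what you want inside CDATA sections, where that text is literal.

```python
import re

# Any code point NOT allowed by the XML 1.0 Char production quoted above.
_INVALID_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)

# Numeric character references: &#123; (decimal) or &#x7B; (hexadecimal).
_CHAR_REF = re.compile(r"&#(?:x([0-9A-Fa-f]+)|([0-9]+));")


def _is_valid_xml_char(cp: int) -> bool:
    """Return True if the code point matches the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def _drop_bad_ref(match: re.Match) -> str:
    """Keep a character reference only if it names an allowed code point."""
    hex_digits, dec_digits = match.group(1), match.group(2)
    cp = int(hex_digits, 16) if hex_digits else int(dec_digits)
    return match.group(0) if _is_valid_xml_char(cp) else ""


def strip_invalid_xml_chars(text: str) -> str:
    """Remove raw characters and character references XML 1.0 disallows."""
    text = _INVALID_XML_CHARS.sub("", text)
    return _CHAR_REF.sub(_drop_bad_ref, text)


if __name__ == "__main__":
    raw = "<a>x\u0016y&#x16;z</a>"
    print(strip_invalid_xml_chars(raw))
```

The cleaned string can then be handed to the XML loader as usual. Note that this silently discards data; if the control characters carry meaning, replacing them with a placeholder or encoding the payload (for example as Base64) is probably safer than deleting them.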

scessor answered Sep 27 '22 22:09