
Storing the value NUL (ASCII) in XML

Tags:

java

xml

Is it possible to save the ASCII NUL character in XML like this <data>*NUL**NUL**NUL*</data>?

I know I can display this value in Java using System.out.println("\0") and I wonder if XML can handle this value.

My objective is to get "\0\0\0" from XML using Java.

Thank you in advance!

Fai asked Feb 07 '23 16:02


1 Answer

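The XML 1.0 specification officially does not allow it: U+0000 is outside the set of legal characters, and a character reference such as &#0; is rejected as well. XML 1.1 excludes U+0000 too.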
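To see the rule in practice, here is a minimal sketch that feeds three small documents to the parser bundled with the JDK. The class name `NulInXmlCheck` and the helper `isWellFormed` are made up for this illustration; other parsers may word their errors differently, but a conforming one should reject both NUL variants.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class NulInXmlCheck {

    // Hypothetical helper: true if the JDK's built-in parser accepts the text.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // The parser reported the document as not well-formed.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<data>abc</data>"));     // ordinary text
        System.out.println(isWellFormed("<data>\0\0\0</data>"));  // raw NUL characters
        System.out.println(isWellFormed("<data>&#0;</data>"));    // NUL as character reference
    }
}
```

Only the first document should be accepted; the raw NUL characters and the `&#0;` reference should both fail as invalid XML characters.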

The ASCII NUL, also written '\0' or \u0000, is an ordinary character in Java. In C/C++, however, it is used as the string terminator, so C software processing the XML would probably detect the end of the XML text far too early.

Java does have a workaround for this. When XML is written in UTF-8, Unicode values > 127 are encoded as multi-byte sequences in which every byte has its 8th bit set. DataOutputStream.writeUTF also writes '\0' as a multi-byte sequence (the two bytes 0xC0 0x80), so no zero byte appears in the encoded text, and DataInputStream.readUTF decodes it back normally.

  • This is not strict UTF-8, which requires the shortest encoding; Java calls it "modified UTF-8".
  • I am still unsure whether C code processing the XML DOM would run into errors.
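The encoding trick can be inspected with a short sketch like the one below. The class name `ModifiedUtf8Demo` and its helpers are invented for the example. Note that writeUTF also prepends a two-byte length, so the output is not a plain text stream an XML parser could read directly.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class ModifiedUtf8Demo {

    // Encodes with DataOutputStream.writeUTF: 2-byte length, then modified UTF-8.
    static byte[] encode(String s) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(buffer)) {
                out.writeUTF(s);
            }
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Decodes with the matching DataInputStream.readUTF.
    static String decode(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return in.readUTF();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Formats bytes as space-separated upper-case hex.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] encoded = encode("\0\0\0");
        System.out.println(toHex(encoded));                   // 00 06 C0 80 C0 80 C0 80
        System.out.println(decode(encoded).equals("\0\0\0")); // true

        // A strict UTF-8 decoder treats C0 80 as malformed, so no NUL comes back.
        String strict = new String(new byte[] {(byte) 0xC0, (byte) 0x80},
                StandardCharsets.UTF_8);
        System.out.println(strict.indexOf('\0') < 0);         // true
    }
}
```

The last check shows the interoperability risk: the round trip only works when both sides use Java's modified UTF-8.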

So it is not a good idea.

Also note that binary data should be converted to Base64 (plain ASCII) instead, as UTF-8 is not suited for binary data.
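For the goal in the question, getting "\0\0\0" out of XML in Java, a Base64 round trip could look like the following sketch. The class name `Base64InXmlDemo` and the `wrap`/`unwrap` helpers are assumptions for illustration; the `<data>` element is taken from the question.

```java
import java.io.IOException;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class Base64InXmlDemo {

    // Hypothetical helper: Base64-encodes the value and wraps it in <data>.
    static String wrap(String value) {
        byte[] raw = value.getBytes(StandardCharsets.UTF_8);
        return "<data>" + Base64.getEncoder().encodeToString(raw) + "</data>";
    }

    // Hypothetical helper: parses the XML and decodes the element text again.
    static String unwrap(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            String text = doc.getDocumentElement().getTextContent().trim();
            return new String(Base64.getDecoder().decode(text), StandardCharsets.UTF_8);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = wrap("\0\0\0");
        System.out.println(xml);                          // <data>AAAA</data>
        System.out.println(unwrap(xml).equals("\0\0\0")); // true
    }
}
```

The XML only ever contains the letters AAAA, so any parser, including one written in C, can handle it safely.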

Joop Eggen answered Feb 20 '23 22:02