
Is the XML declaration tag case sensitive?

I have what is probably a really simple, stupid question, but I can't find an answer to it anywhere and I need to be pretty sure about this.

I have various XML files from various vendors. One of the vendors provides me with an XML file that contains Japanese characters. Originally, I was having trouble processing the XML file (I'm using the MSXML SDK): the characters would come out wrong. I found that if the following declaration was added to the XML file, everything worked great.

<?xml version="1.0" encoding="UTF-16"?>

And so I asked the vendor to add this to their file. But they added it with the encoding in lower case:

<?xml version="1.0" encoding="utf-16"?>

And when I load this new file, with this declaration, I'm getting the same problem as when this declaration was not there.

What I'm trying to figure out (for sure) is whether that encoding attribute is case sensitive (or whether something else is the problem). Does it matter that they put "utf-16" versus "UTF-16"?

Update: On the advice of those who posted answers here, I set up and ran a test. One file had the lowercase utf-16 and the other the uppercase UTF-16. Other than that, the files were identical. Both files showed the same problem, so the case of the encoding name is not the cause. My conclusion is that MSXML matches the encoding name case-insensitively, as the spec quoted in the answers recommends.
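Since the case of the name turned out not to matter, one thing worth ruling out is whether the file's bytes actually match the declared encoding. The sketch below is only an illustration in Python, not something MSXML provides: `sniff_bom` is a hypothetical helper that reports which byte order mark, if any, the data starts with. A file declared as UTF-16 would normally be expected to start with a UTF-16 BOM.

```python
import codecs

def sniff_bom(data: bytes) -> str:
    """Return a label for the byte order mark at the start of data.

    Hypothetical helper for illustration only. UTF-32 is checked
    before UTF-16 because the UTF-32 LE BOM begins with the same
    two bytes as the UTF-16 LE BOM.
    """
    boms = [
        (codecs.BOM_UTF32_LE, "utf-32-le"),
        (codecs.BOM_UTF32_BE, "utf-32-be"),
        (codecs.BOM_UTF8, "utf-8"),
        (codecs.BOM_UTF16_LE, "utf-16-le"),
        (codecs.BOM_UTF16_BE, "utf-16-be"),
    ]
    for bom, label in boms:
        if data.startswith(bom):
            return label
    return "no BOM"

# Example: Python's "utf-16" codec writes a BOM in native byte order.
sample = '<?xml version="1.0" encoding="utf-16"?><a/>'.encode("utf-16")
print(sniff_bom(sample))
```

If the vendor's file reports "no BOM" or "utf-8" here while declaring UTF-16, the declaration and the actual bytes disagree, which would be consistent with the characters coming out wrong regardless of the case of the name.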

Frank V asked May 28 '09 15:05


2 Answers

I suppose the question is not really "is the standard case-sensitive?" but "is the encoding case-sensitive in the MSXML SDK?"

From bytes.com:

The XML spec says that processors "SHOULD" match encoding names case-insensitively. "SHOULD" is a technical term, less strong than "MUST", but I can't see any reason why a processor would not do it.

However, a recommendation is not a guarantee, so a given processor may not follow it in practice. If you can try both variants side by side, please do so and let us know what the result is.
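A side-by-side check of this kind can be sketched in Python. Note the assumptions: this uses Python's expat-based `xml.etree.ElementTree`, not MSXML, so it only illustrates the shape of the test and says nothing about MSXML's behavior. The helper name `parse_declared` and the sample text are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Sample Japanese text, written with escapes so the source stays ASCII.
SAMPLE = "\u65e5\u672c\u8a9e"

def parse_declared(encoding_name: str) -> str:
    """Build a UTF-16 document that declares the given encoding name,
    parse it, and return the text of the root element.

    Hypothetical helper: only the spelling of the declared name varies;
    the bytes are always real UTF-16 with a BOM.
    """
    doc = f'<?xml version="1.0" encoding="{encoding_name}"?><item>{SAMPLE}</item>'
    data = doc.encode("utf-16")  # BOM followed by native-endian UTF-16
    return ET.fromstring(data).text

upper = parse_declared("UTF-16")
lower = parse_declared("utf-16")

# Both spellings should round-trip the same text if the parser
# matches encoding names case-insensitively.
print(upper == SAMPLE, lower == SAMPLE)
```

Keeping the bytes identical and changing only the declared name isolates the one variable under test, which mirrors the two-file experiment described in the question's update.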

JoshJordan answered Oct 25 '22 15:10


From the XML specs:

XML processors SHOULD match character encoding names in a case-insensitive way

So case-insensitive matching is not required, but it is recommended. RFC 2119 defines the term as follows:

  3. SHOULD This word, or the adjective "RECOMMENDED", mean that there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be understood and carefully weighed before choosing a different course.

schnaader answered Oct 25 '22 15:10