
Is encoding Cp1252 invalid in an XML file?

Tags: java, xml, encoding

An XML file I ran across is failing a well-formedness check, even though it looks well-formed to me (I might be wrong).

I have reduced it to a trivial example:

<?xml version="1.0" encoding="Cp1252"?>
<jnlp/>

The method being used to do the check works like this:

import java.io.InputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public static boolean isWellFormedXml(InputStream inputStream) {
    try {
        XMLInputFactory inputFactory = XMLInputFactory.newInstance();
        inputFactory.setProperty(XMLInputFactory.IS_COALESCING, false);
        inputFactory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        XMLStreamReader reader = inputFactory.createXMLStreamReader(inputStream);
        try {
            // Scan through all the reader tokens to ensure everything is well formed
            while (reader.hasNext()) {
                reader.next();
            }
        } finally {
            reader.close();
        }
    } catch (XMLStreamException e) {
        // Any parse error means the input is not well-formed
        return false;
    }

    return true;
}

The error I'm seeing is:

javax.xml.stream.XMLStreamException: ParseError at [row,col]:[1,40]

Message: Invalid encoding name "Cp1252".

The only problem is that I can set a breakpoint in the catch block and confirm that this encoding name does resolve. So what's the deal here? Does XML also restrict which encoding names you're allowed to use in the prolog?
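For reference, the "does resolve" part can be reproduced outside a debugger. This is only a minimal sketch, assuming the lookup in question is a plain java.nio.charset.Charset lookup (the post does not say how it was confirmed); the class name CharsetAliasDemo and the helper canonicalName are made up for illustration.

```java
import java.nio.charset.Charset;

class CharsetAliasDemo {
    // Returns the canonical java.nio name that a charset label resolves to.
    // Throws UnsupportedCharsetException if the JVM does not know the label.
    static String canonicalName(String label) {
        return Charset.forName(label).name();
    }

    public static void main(String[] args) {
        // The Java-specific label from the XML declaration in the question.
        System.out.println("Cp1252 resolves to: " + canonicalName("Cp1252"));
        // The aliases the JVM accepts for the same charset.
        System.out.println("Aliases: " + Charset.forName("Cp1252").aliases());
    }
}
```

So the name is known to Java's charset lookup; whether an XML parser accepts it in an encoding declaration is a separate matter.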

asked Oct 04 '22 08:10 by Hakanai


1 Answer

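Check the IANA character set registry:

http://www.iana.org/assignments/character-sets/character-sets.xml

I guess the encoding you're looking for is windows-1252, which is the name registered with IANA. Cp1252 might be a valid charset alias in Java, but in XML you're not supposed to use it (by that name): the XML spec recommends IANA-registered names in the encoding declaration.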
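To see the difference, the check from the question can be run against the same trivial document with both names. This is a minimal sketch, assuming the JDK's bundled StAX implementation; the class name EncodingNameCheck and the helper documentWithEncoding are made up for illustration, and a different StAX implementation on the classpath may be more lenient about Java-specific names.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class EncodingNameCheck {
    // Same well-formedness scan as in the question.
    static boolean isWellFormedXml(InputStream inputStream) {
        try {
            XMLInputFactory inputFactory = XMLInputFactory.newInstance();
            inputFactory.setProperty(XMLInputFactory.IS_COALESCING, false);
            inputFactory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
            XMLStreamReader reader = inputFactory.createXMLStreamReader(inputStream);
            try {
                // Walk every event so any parse error surfaces.
                while (reader.hasNext()) {
                    reader.next();
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            return false;
        }
        return true;
    }

    // Hypothetical helper: the trivial document from the question with a
    // chosen encoding label. The content is pure ASCII, so the bytes are the
    // same under either name.
    static InputStream documentWithEncoding(String encodingName) {
        String xml = "<?xml version=\"1.0\" encoding=\"" + encodingName + "\"?>\n<jnlp/>";
        return new ByteArrayInputStream(xml.getBytes(StandardCharsets.US_ASCII));
    }

    public static void main(String[] args) {
        System.out.println("windows-1252: " + isWellFormedXml(documentWithEncoding("windows-1252")));
        System.out.println("Cp1252: " + isWellFormedXml(documentWithEncoding("Cp1252")));
    }
}
```

With the default JDK parser, the windows-1252 variant should pass, and the Cp1252 variant is the one that raises the "Invalid encoding name" error reported in the question.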

answered Oct 13 '22 11:10 by rmalchow