
Which characters are Invalid (unless encoded) in an XML attribute?

I can't believe I can't find this information easily accessible, so:

1) Which characters cannot be incorporated in an XML attribute without entity-encoding them?

Obviously, you need to encode quotes. What about < and >? What else?

2) Where exactly is the official list?

Euro Micelli asked May 15 '09 02:05



1 Answer

The official definition of what is allowed in an attribute value is the AttValue production in the XML 1.0 specification (section 2.3, production [10]):

'"' ([^<&"] | Reference)* '"'  |  "'" ([^<&'] | Reference)* "'"  

So, you can't have:

  • the same character that opens/closes the attribute value (either ' or ")
  • a naked ampersand (& must be &amp;)
  • a left angle bracket (< must be &lt;)
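
As a rough illustration of the three rules above, here is a minimal Python sketch. The helper name `escape_attr` and its `quote` parameter are hypothetical and exist only for this example; it assumes the value contains only characters that are legal in XML to begin with.

```python
def escape_attr(value: str, quote: str = '"') -> str:
    """Hypothetical helper: escape only what the AttValue production forbids
    and return the value wrapped in the chosen quote character."""
    if quote not in ('"', "'"):
        raise ValueError("quote must be ' or \"")
    # Ampersand first, so the entities added below are not escaped again.
    out = value.replace("&", "&amp;").replace("<", "&lt;")
    # Only the delimiter that opens/closes the value needs encoding.
    if quote == '"':
        out = out.replace('"', "&quot;")
    else:
        out = out.replace("'", "&apos;")
    return quote + out + quote


print(escape_attr('a < b & "c" > d'))
print(escape_attr("it's \"fine\"", quote="'"))
```

In real code, the standard library's `xml.sax.saxutils.quoteattr` is a ready-made alternative; it escapes a few more characters than the grammar strictly requires.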

A right angle bracket (>) is allowed as-is. You should also not be using any characters that are not legal anywhere in an XML document (such as form feeds).
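
To spot characters that are not legal anywhere in a document, one option is to test against the `Char` production of XML 1.0 (tab, line feed, carriage return, and the ranges U+0020 to U+D7FF, U+E000 to U+FFFD, U+10000 to U+10FFFF). The sketch below assumes XML 1.0 rather than 1.1, and the function name `find_illegal_xml_chars` is made up for this example.

```python
import re

# Anything outside the XML 1.0 Char production is illegal, even when encoded.
_ILLEGAL_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def find_illegal_xml_chars(text: str) -> list[tuple[int, str]]:
    """Return (index, hex code point) pairs for characters XML 1.0 disallows."""
    return [(m.start(), hex(ord(m.group())))
            for m in _ILLEGAL_XML_CHARS.finditer(text)]


# A form feed (U+000C) is reported; tab and newline are not.
print(find_illegal_xml_chars("ok\x0cbad\tfine\n"))
```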

great_llama answered Oct 22 '22 05:10