What is an xs:NCName type and when should it be used?

@skyl practically provoked me to write this answer so please mind the redundancy.

NCName stands for "non-colonized name". NCName can be defined as an XML Schema regular expression [\i-[:]][\c-[:]]*

...and what does that regex mean?

\i and \c are multi-character escapes defined in the XML Schema specification:
http://www.w3.org/TR/xmlschema-2/#dt-ccesN
\i is the escape for the set of initial XML name characters and \c is the escape for the set of XML name characters. [\i-[:]] means the set \i minus the set that consists of the colon character :. In plain English: "any initial name character, but not :". The whole regular expression reads as "one initial XML name character, but not a colon, followed by zero or more XML name characters, none of which is a colon."
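
Python's re module does not understand the XML Schema escapes \i and \c, so the pattern cannot be pasted in as-is. As a rough sketch, the check below restricts itself to ASCII, where \i minus the colon comes down to letters and underscore, and \c minus the colon adds digits, dot and hyphen. The name is_ascii_ncname is made up for illustration, and the function rejects the non-ASCII letters that a real NCName may contain.

```python
import re

# ASCII-only approximation of [\i-[:]][\c-[:]]* (assumption: non-ASCII
# name characters, which real NCNames allow, are deliberately left out).
_ASCII_NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")


def is_ascii_ncname(value: str) -> bool:
    """Return True if value is an NCName made only of ASCII characters."""
    # fullmatch requires the whole string to match, so the empty string
    # and strings with a trailing newline are rejected.
    return _ASCII_NCNAME.fullmatch(value) is not None
```

Under these assumptions, "name" and "_id-1.2" pass, while "ns:name", "1abc", "-abc" and the empty string do not.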

Practical restrictions of an NCName

The practical restrictions of NCName are that it cannot contain several symbol characters such as :, @, $, %, &, /, +, ,, ;, nor whitespace characters or brackets of any kind. Furthermore, an NCName cannot begin with a digit, a dot or a hyphen (minus character), although these can appear later in an NCName.

Where are NCNames needed

In namespace-conformant XML documents, all names must be either qualified names or NCNames. The following values must be NCNames (not qualified names):

  • namespace prefixes
  • values representing an ID
  • values representing an IDREF
  • values representing a NOTATION
  • processing instruction targets
  • entity names

An NCName is a non-colonized name, e.g. "name", as opposed to a QName, which is a qualified name, e.g. "ns:name". If your names are not supposed to be qualified by different namespaces, then they are NCNames.
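
To illustrate the difference, here is a minimal sketch that splits a value at its colon. The helper name split_qname is hypothetical. It only looks at the colon and does not check that each part consists of valid name characters.

```python
from typing import Optional, Tuple


def split_qname(qname: str) -> Tuple[Optional[str], str]:
    """Split 'prefix:local' into (prefix, local); prefix is None if absent."""
    prefix, colon, local = qname.partition(":")
    if not colon:
        # No colon at all: the whole value is an unprefixed (non-colonized) name.
        return None, qname
    if not prefix or not local or ":" in local:
        # An empty part or a second colon cannot form a qualified name.
        raise ValueError(f"not a valid QName: {qname!r}")
    return prefix, local
```

Here split_qname("ns:name") gives ("ns", "name") and split_qname("name") gives (None, "name"). Both "ns" and "name" are themselves NCNames, which is why a QName can contain at most one colon.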

xs:string puts no restrictions on your names at all, whereas xs:NCName, most notably, does not allow ":" anywhere in the value.


Practically speaking...

Allowed characters (within the ASCII range): -, ., 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z, _, a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z

Also, -, . and the digits 0-9 cannot be used as the first character of the value.

Disallowed characters: space, !, ", #, $, %, &, ', (, ), *, +, ,, /, :, ;, <, =, >, ?, @, [, \, ], ^, `, {, |, }, ~
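
The two lists can be reproduced mechanically for printable ASCII. The sketch below is limited to ASCII by assumption (real NCNames also accept many non-ASCII letters), and the names NAME_CHARS, START_CHARS and classify_printable_ascii are invented for this example.

```python
import string

# ASCII characters that may appear anywhere in an NCName.
NAME_CHARS = set(string.ascii_letters + string.digits + "_-.")
# ASCII characters that may appear in the first position.
START_CHARS = set(string.ascii_letters + "_")


def classify_printable_ascii():
    """Split printable ASCII (0x20-0x7E) into allowed, disallowed, not-first."""
    printable = [chr(cp) for cp in range(0x20, 0x7F)]
    allowed = "".join(ch for ch in printable if ch in NAME_CHARS)
    disallowed = "".join(ch for ch in printable if ch not in NAME_CHARS)
    # Allowed characters that still may not start an NCName.
    not_first = "".join(ch for ch in allowed if ch not in START_CHARS)
    return allowed, disallowed, not_first
```

The third value contains the hyphen, the dot and the ten digits, which are the characters that are allowed in an NCName but not as its first character.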


http://books.xmlschemata.org/relaxng/ch19-77215.html

No spaces or colons. Allows "_" and "-".

You would use this instead of string so that you can validate that the value is limited to what is allowed. It maps well to certain conventions for names and identifiers, such as Django's concept of a "slug".

I'll upvote the person who translates [\i-[:]][\c-[:]]* into English for us.