What is a good general regex (in PHP terms) to determine if a string is a valid XML tag name?
I started with /[^>]+/i
but that also matches something like 4 <<
which obviously isn't a valid tag name.
So I tried combining all valid characters like /[a-z][a-z0-9_-]*/i
which also isn't quite right, as XML allows a very wide range of characters in tag names, including letters from non-Latin scripts.
I'm stuck on that now - should I just check if there are whitespace characters? Or is there more to it?
Why don't you just use an XML parser/generator, which already knows the rules?
function isValidXmlElementName($elementName)
{
    try {
        // DOMElement's constructor throws a DOMException for an invalid name
        new DOMElement($elementName);
    } catch (DOMException $e) {
        return false;
    }
    return true;
}
var_dump(isValidXmlElementName(' ')); // false
var_dump(isValidXmlElementName('1')); // false
var_dump(isValidXmlElementName('-')); // false
var_dump(isValidXmlElementName('a')); // true
From the XML specification:
[4] NameStartChar ::= ":" | [A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D] | [#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]
[4a] NameChar ::= NameStartChar | "-" | "." | [0-9] | #xB7 | [#x0300-#x036F] | [#x203F-#x2040]
[5] Name ::= NameStartChar (NameChar)*
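If you do want a regex, the productions quoted above can be translated fairly directly into a PCRE character class. The following is only a sketch: the function name isValidXmlName is made up for illustration, and it assumes the input is UTF-8 and that you want the plain Name production (which permits colons), not a namespace-aware check. The /u modifier is needed so that \x{...} refers to Unicode code points, and \A ... \z is used instead of ^ ... $ so that a trailing newline is not accepted.

```php
<?php
// Hypothetical helper: checks a string against the XML "Name" production
// ([4], [4a], [5]) quoted above. Assumes UTF-8 input.
function isValidXmlName(string $name): bool
{
    // [4] NameStartChar, written as the inside of a character class
    $nameStartChar = ':A-Z_a-z\x{C0}-\x{D6}\x{D8}-\x{F6}\x{F8}-\x{2FF}'
        . '\x{370}-\x{37D}\x{37F}-\x{1FFF}\x{200C}-\x{200D}'
        . '\x{2070}-\x{218F}\x{2C00}-\x{2FEF}\x{3001}-\x{D7FF}'
        . '\x{F900}-\x{FDCF}\x{FDF0}-\x{FFFD}\x{10000}-\x{EFFFF}';

    // [4a] NameChar = NameStartChar plus "-", ".", digits and a few extras
    $nameChar = $nameStartChar . '\-.0-9\x{B7}\x{300}-\x{36F}\x{203F}-\x{2040}';

    // [5] Name = NameStartChar (NameChar)*
    $pattern = '/\A[' . $nameStartChar . '][' . $nameChar . ']*\z/u';

    // preg_match() returns false on invalid UTF-8, which is treated as "not valid"
    return preg_match($pattern, $name) === 1;
}

var_dump(isValidXmlName('4 <<'));
var_dump(isValidXmlName('a-b.c1'));
var_dump(isValidXmlName('名前'));
```

Under these assumptions the three calls should report false, true and true respectively. Note that this only tests the lexical form of a name; the DOMElement approach above is less code and leaves the rules to the parser.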