I tried:
document.doctype = xml.dom.minidom.DocumentType('html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "DTD/xhtml1-strict.dtd"')
There is no doctype in the output. How can I fix this without inserting the doctype by hand?
xml.dom.minidom is a minimal implementation of the Document Object Model interface, with an API similar to that in other languages. It is intended to be simpler than the full DOM and also significantly smaller.
The XML Document Object Model (DOM) class is an in-memory representation of an XML document. The DOM allows you to programmatically read, manipulate, and modify an XML document. The XmlReader class also reads XML; however, it provides non-cached, forward-only, read-only access.
The Document Object Model, or “DOM,” is a cross-language API from the World Wide Web Consortium (W3C) for accessing and modifying XML documents. A DOM implementation presents an XML document as a tree structure, or allows client code to build such a structure from scratch.
You shouldn't instantiate classes from minidom directly. It's not a supported part of the API, the ownerDocuments won't tie up and you can get some strange misbehaviours. Instead use the proper DOM Level 2 Core methods:
>>> from xml.dom import minidom
>>> imp= minidom.getDOMImplementation('')
>>> dt= imp.createDocumentType('html', '-//W3C//DTD XHTML 1.0 Strict//EN', 'http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd')
(‘DTD/xhtml1-strict.dtd’ is a commonly used but wrong SystemId. That relative URL would only be valid inside the xhtml1 folder at w3.org.)
Now that you've got a DocumentType node, you can add it to a document. According to the standard, the only guaranteed way of doing this is at document creation time:
>>> doc= imp.createDocument('http://www.w3.org/1999/xhtml', 'html', dt)
>>> print(doc.toxml())
<?xml version="1.0" ?><!DOCTYPE html PUBLIC '-//W3C//DTD XHTML 1.0 Strict//EN' 'http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd'><html/>
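For anyone running this as a script rather than at the prompt, here is a minimal sketch of the same creation-time approach. It only uses the calls shown above; the checks at the end are my own addition, there to show that the doctype and the document really are tied to each other this way.

```python
from xml.dom import minidom

XHTML_NS = 'http://www.w3.org/1999/xhtml'

imp = minidom.getDOMImplementation()
dt = imp.createDocumentType(
    'html',
    '-//W3C//DTD XHTML 1.0 Strict//EN',
    'http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd',
)

# Passing the doctype at creation time is the route the standard guarantees.
doc = imp.createDocument(XHTML_NS, 'html', dt)

# Both directions of the link are set, unlike with direct instantiation.
print(doc.doctype is dt)        # the document knows its doctype
print(dt.ownerDocument is doc)  # the doctype knows its document

xml_text = doc.toxml()
print(xml_text)
```

From here you build the rest of the tree under doc.documentElement as usual.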
If you want to change the doctype of an existing document, that's more trouble. The DOM standard doesn't require that DocumentType nodes with no ownerDocument be insertable into a document. However, some DOMs allow it, e.g. pxdom. minidom kind of allows it:
>>> doc= minidom.parseString('<html xmlns="http://www.w3.org/1999/xhtml"><head/><body/></html>')
>>> dt= minidom.getDOMImplementation('').createDocumentType('html', '-//W3C//DTD XHTML 1.0 Strict//EN', 'http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd')
>>> doc.insertBefore(dt, doc.documentElement)
<xml.dom.minidom.DocumentType instance>
>>> print(doc.toxml())
<?xml version="1.0" ?><!DOCTYPE html PUBLIC '-//W3C//DTD XHTML 1.0 Strict//EN' 'http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd'><html xmlns="http://www.w3.org/1999/xhtml"><head/><body/></html>
but with bugs:
>>> doc.doctype
# None
>>> dt.ownerDocument
# None
which may or may not matter to you.
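If it does matter, a small sketch like the following shows the mismatch: the doctype node is present among the document's children (which is why it serializes), but the doctype property never learns about it. find_doctype is a hypothetical helper of mine, not part of minidom.

```python
from xml.dom import minidom

def find_doctype(doc):
    """Return the first DocumentType child of doc, or None (hypothetical helper)."""
    for child in doc.childNodes:
        if child.nodeType == child.DOCUMENT_TYPE_NODE:
            return child
    return None

doc = minidom.parseString(
    '<html xmlns="http://www.w3.org/1999/xhtml"><head/><body/></html>'
)
dt = minidom.getDOMImplementation().createDocumentType(
    'html',
    '-//W3C//DTD XHTML 1.0 Strict//EN',
    'http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd',
)
doc.insertBefore(dt, doc.documentElement)

# The node is in the tree, so toxml() writes it out ...
print(find_doctype(doc) is dt)
# ... but the bookkeeping attributes were not updated by insertBefore.
print(doc.doctype is None)
print(dt.ownerDocument is None)
```

Code that only serializes the document won't notice; code that reads doc.doctype will.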
Technically, the only reliable way per the standard to set a doctype on an existing document is to create a new document and import the whole of the old document into it!
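def setDoctype(document, doctype):
    imp= document.implementation
    # New document whose root element is a placeholder named after the doctype
    newdocument= imp.createDocument(doctype.namespaceURI, doctype.name, doctype)
    # xmlVersion is a DOM Level 3 attribute; not every implementation defines it
    if hasattr(document, 'xmlVersion'):
        newdocument.xmlVersion= document.xmlVersion
    refel= newdocument.documentElement
    for child in document.childNodes:
        if child.nodeType==child.ELEMENT_NODE:
            # Swap the placeholder root for a deep copy of the real root
            newdocument.replaceChild(
                newdocument.importNode(child, True), newdocument.documentElement
            )
            refel= None
        elif child.nodeType!=child.DOCUMENT_TYPE_NODE:
            # Comments and PIs go before the root until it is seen, then after it
            newdocument.insertBefore(newdocument.importNode(child, True), refel)
    return newdocument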
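To try that out end to end, here is a self-contained sketch. It restates the function (renamed set_doctype, with a hasattr guard around xmlVersion as my own assumption, since not every DOM defines that attribute) so the block runs on its own, then applies it to a parsed document that has a leading comment.

```python
from xml.dom import minidom

def set_doctype(document, doctype):
    """Return a copy of document that carries doctype; the original is untouched."""
    imp = document.implementation
    new_doc = imp.createDocument(doctype.namespaceURI, doctype.name, doctype)
    # Assumption: guard xmlVersion, which is DOM Level 3 and may be absent.
    if hasattr(document, 'xmlVersion'):
        new_doc.xmlVersion = document.xmlVersion
    ref = new_doc.documentElement
    for child in document.childNodes:
        if child.nodeType == child.ELEMENT_NODE:
            # Replace the placeholder root with a deep copy of the real one.
            new_doc.replaceChild(new_doc.importNode(child, True),
                                 new_doc.documentElement)
            ref = None
        elif child.nodeType != child.DOCUMENT_TYPE_NODE:
            new_doc.insertBefore(new_doc.importNode(child, True), ref)
    return new_doc

old = minidom.parseString(
    '<!-- note --><html xmlns="http://www.w3.org/1999/xhtml"><head/><body/></html>'
)
dt = minidom.getDOMImplementation().createDocumentType(
    'html',
    '-//W3C//DTD XHTML 1.0 Strict//EN',
    'http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd',
)
new = set_doctype(old, dt)

print(new.doctype is dt)        # doctype property is set this time
print(dt.ownerDocument is new)  # and the doctype is owned by the new document
print(new.toxml())
```

Note that you get a new document back, so any references you hold to nodes of the old one still point into the old tree.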