
Invalid output from a broadcast handler in the Common Lisp Closure XML package

Following the answers to my previous question, "How to inject elements into character content with Closure XML?", I implemented a subclass of the cxml:sax-proxy handler (a particular case of a broadcast handler). Unfortunately, the output is not well-formed XML: the handler tries to write the document with an internal document type definition, but gets the syntax wrong. This looks like a bug in the library, but I am not sure.

That is, running the parser with the command:

(with-open-file (out #P"teste.xml" :if-exists :supersede :direction :output)
  (let ((h (make-instance 'preproc
                          :chained-handler (cxml:make-character-stream-sink out))))
    (cxml:parse #P"harem.xml" h :validate t)))

where the file harem.xml begins with (see the doctype):

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE colHAREM SYSTEM "harem.dtd">
<colHAREM versao="Segundo_dourada_com_relacoes_14Abril2010">
  <DOC DOCID="H2-dftre765">
    <p>...

the command writes the following to the teste.xml output file:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE colHAREM SYSTEM "harem.dtd"<!ELEMENT EM #PCDATA>
<!ATTLIST EM ID CDATA #REQUIRED>
<!ATTLIST EM CATEG CDATA #IMPLIED>
<!ATTLIST EM TIPO CDATA #IMPLIED>
<!ATTLIST EM COMENT CDATA #IMPLIED>
<!ATTLIST EM SUBTIPO CDATA #IMPLIED>
<!ELEMENT ALT (#PCDATA|EM)*>
<!ELEMENT OMITIDO (#PCDATA|EM|ALT|p)*>
<!ELEMENT colHAREM (DOC)*>
<!ATTLIST colHAREM versao CDATA #REQUIRED>
<!ELEMENT p (#PCDATA|EM|OMITIDO|ALT)*>
<!ATTLIST p xml:space (default|preserve) "default">
<!ELEMENT DOC (#PCDATA|p|OMITIDO)*>
<!ATTLIST DOC DOCID CDATA #REQUIRED>
>
<colHAREM versao="Segundo_dourada_com_relacoes_14Abril2010">
...

That is, the handler writes the DTD declarations into the output, but incorrectly: they are not enclosed in [ and ], so the DOCTYPE declaration is malformed. Is this a bug in the library or in my code?

Alexandre Rademaker asked Nov 01 '22 14:11


1 Answer

I traced through the steps CXML takes for your example and prepared a patch here (the first file, against the latest CXML commit, 991fac513dbd9b86628f99741a66d791552b1f02; apply it with git apply 0001-....patch in the root of the checked-out CXML repository). It looks to me as though this code path simply doesn't trigger the SAX event for the DTD subset; after adding that event, the output contains the necessary "[" and "]".

Can you please verify that this works for you? I'm also not sure whether SAX:START-INTERNAL-SUBSET is actually the right event to use, but it seems to do the job here.
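
One way to check the result is to re-parse the generated file, since a DOCTYPE that is still malformed should make the parser signal an error. This is only a sketch under a few assumptions: CXML is already loaded (for example via Quicklisp), teste.xml has been regenerated with the patched library, and harem.dtd can be found next to it. The printed messages are illustrative.

```lisp
;; Assumption: CXML is already loaded, e.g. with (ql:quickload :cxml),
;; and teste.xml was regenerated after applying the patch.
(handler-case
    (progn
      ;; Build a DOM from the output; the resulting tree is discarded,
      ;; we only care whether parsing succeeds.
      (cxml:parse #P"teste.xml" (cxml-dom:make-dom-builder))
      (format t "teste.xml parsed without errors~%"))
  ;; cxml:xml-parse-error is the base condition for CXML parse failures.
  (cxml:xml-parse-error (e)
    (format t "teste.xml could not be parsed: ~A~%" e)))
```

Passing :validate t to cxml:parse, as in the question, would additionally check the regenerated document against its DTD.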

ferada answered Nov 11 '22 09:11