
lxml.etree.SerialisationError: IO_ENCODER ERROR when using cabby/libtaxii

The company I work for has a production TAXII server (using STIX 1.1.1) that isn't quite working with some of our clients' TAXII client implementations, so I'm developing a test server to try to work out some of the bugs. To populate it, I've been either pulling down information from our TAXII server using cabby, or pulling STIX/XML files directly from our (non-TAXII) API and slotting them into the test server backend. One of the issues I'm running into while using cabby against both the production and test TAXII servers is this error in the Python lxml library, which is a dependency of cabby (this is just the bottom of a larger stack trace):

    taxii_xml = response_message.to_xml(pretty_print=True)
  File "/usr/local/lib/python3.6/dist-packages/libtaxii/common.py", line 239, in to_xml
    return etree.tostring(self.to_etree(), pretty_print=pretty_print)
  File "src/lxml/etree.pyx", line 3435, in lxml.etree.tostring
  File "src/lxml/serializer.pxi", line 139, in lxml.etree._tostring
  File "src/lxml/serializer.pxi", line 199, in lxml.etree._raiseSerialisationError
lxml.etree.SerialisationError: IO_ENCODER

I've been trying to find what in the XML is causing this error, but I'm not having a lot of success. Attempts to filter out possibly objectionable characters from the XML have been partially successful, but I'm not sure that's actually what's causing the problem. Does anyone have a good explanation for what exactly causes this error in lxml? I assume it has something to do with XML formatting, but knowing what kind of malformed input triggers it would be extremely helpful.
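For reference, the character filtering mentioned above can be sketched roughly as follows. This is only a minimal sketch, assuming the "objectionable characters" are those outside the XML 1.0 Char range; nothing here confirms that such characters are what triggers IO_ENCODER. The helper names find_invalid_xml_chars and strip_invalid_xml_chars are made up for illustration and are not part of lxml, cabby or libtaxii.

```python
import re

# Characters NOT allowed by the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_INVALID_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def find_invalid_xml_chars(text):
    """Return (offset, hex code point) for each character XML 1.0 forbids."""
    return [(m.start(), hex(ord(m.group()))) for m in _INVALID_XML_CHARS.finditer(text)]


def strip_invalid_xml_chars(text):
    """Return text with every XML 1.0-forbidden character removed."""
    return _INVALID_XML_CHARS.sub("", text)


if __name__ == "__main__":
    sample = "ok\x00bad\x0bend"  # hypothetical payload with two control characters
    print(find_invalid_xml_chars(sample))
    print(strip_invalid_xml_chars(sample))
```

Reporting the offsets before stripping makes it easier to see which documents actually contain such characters, and whether those are the same documents that fail to serialise.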

jfeldzy asked Oct 29 '25 22:10


2 Answers

Passing an explicit encoding to your tostring call fixes it in most cases:

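import lxml.etree as ET

# parse() returns an ElementTree, which tostring() accepts directly
tree = ET.parse('some_file.xml')
# with an explicit encoding tostring() returns bytes, so decode() to get a str
outstr = ET.tostring(tree, encoding='UTF-8', pretty_print=True).decode()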

Without the encoding='UTF-8' parameter I get the SerialisationError IO_ENCODER. Once I added the encoding parameter and called decode() on the result, the errors all disappeared.
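Applied to the libtaxii case from the question, the same idea might look like the sketch below. It assumes the message object exposes to_etree(), as the stack trace suggests (to_xml() calls etree.tostring(self.to_etree(), ...) without an encoding). The helper name to_xml_utf8 is made up for illustration and is not part of libtaxii or lxml.

```python
from lxml import etree


def to_xml_utf8(node, pretty_print=True):
    """Serialise an lxml element or tree to str via an explicit UTF-8 encoding."""
    raw = etree.tostring(node, encoding="UTF-8", pretty_print=pretty_print)
    return raw.decode("utf-8")


if __name__ == "__main__":
    # Stand-in document; with libtaxii you would pass response_message.to_etree()
    # instead (assumption based on the stack trace in the question).
    root = etree.fromstring("<a><b>caf\u00e9</b></a>")
    print(to_xml_utf8(root))
```

This bypasses to_xml() rather than patching libtaxii, so it only helps in places where you control the serialisation call yourself.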

Downgrading to a version below 4 seems like overkill, and versions < 4 don't compile on the latest macOS anyway.

Walter Schreppers answered Oct 31 '25 12:10


Did you happen to migrate your system to a newer one?

In our case, an old system showed this error with lxml 4.5.

Rolling it back to 2.3 solved the error:

sudo su
pip uninstall lxml
apt-get install libxml2-dev libxslt1-dev
pip install lxml==2.3
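Before and after a rollback like this, it may help to check which lxml and libxml2 versions are actually in use, since a migration can change both. A small sketch using lxml's version constants; the helper name lxml_versions is made up for illustration.

```python
from lxml import etree


def lxml_versions():
    """Collect the version tuples lxml reports for itself and for libxml2."""
    return {
        "lxml": etree.LXML_VERSION,
        "libxml2 (running)": etree.LIBXML_VERSION,
        "libxml2 (compiled against)": etree.LIBXML_COMPILED_VERSION,
    }


if __name__ == "__main__":
    for name, version in lxml_versions().items():
        print(name, ".".join(str(part) for part in version))
```

Comparing this output between the machine that works and the one that fails shows whether the two really differ in lxml, in libxml2, or in both.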
goodhyun answered Oct 31 '25 10:10