
Twisted XmlStream: How to connect to events?

I would like to implement a Twisted server that expects XML requests and sends XML responses in return:

<request type='type 01'><content>some request content</content></request>
<response type='type 01'><content>some response content</content></response>
<request type='type 02'><content>other request content</content></request>
<response type='type 02'><content>other response content</content></response>

I have previously written a Twisted client and server that exchanged simple strings, and I tried to extend that to XML, but I can't figure out how to set it all up correctly.

client.py:

#!/usr/bin/env python
# encoding: utf-8

from twisted.internet             import reactor
from twisted.internet.endpoints   import TCP4ClientEndpoint, connectProtocol
from twisted.words.xish.domish    import Element, IElement
from twisted.words.xish.xmlstream import XmlStream

class XMLClient(XmlStream):

    def sendObject(self, obj):
        if IElement.providedBy(obj):
            print "[TX]: %s" % obj.toXml()
        else:
            print "[TX]: %s" % obj
        self.send(obj)

def gotProtocol(p):
    request = Element((None, 'request'))
    request['type'] = 'type 01'
    request.addElement('content').addContent('some request content')
    p.sendObject(request)

    request = Element((None, 'request'))
    request['type'] = 'type 02'
    request.addElement('content').addContent('other request content')

    reactor.callLater(1, p.sendObject, request)
    reactor.callLater(2, p.transport.loseConnection)

endpoint = TCP4ClientEndpoint(reactor, '127.0.0.1', 12345)
d = connectProtocol(endpoint, XMLClient())
d.addCallback(gotProtocol)

from twisted.python import log
d.addErrback(log.err)

reactor.run()

As in the earlier string-based version, the client idles until I press CTRL+C. Once this works, I plan to borrow heavily from the Twisted XMPP example.

server.py:

#!/usr/bin/env python
# encoding: utf-8

from twisted.internet             import reactor
from twisted.internet.endpoints   import TCP4ServerEndpoint
from twisted.words.xish.xmlstream import XmlStream, XmlStreamFactory
from twisted.words.xish.xmlstream import STREAM_CONNECTED_EVENT, STREAM_START_EVENT, STREAM_END_EVENT

REQUEST_CONTENT_EVENT = intern("//request/content")

class XMLServer(XmlStream):
    def __init__(self):
        XmlStream.__init__(self)
        self.addObserver(STREAM_CONNECTED_EVENT, self.onConnected)
        self.addObserver(STREAM_START_EVENT,     self.onRequest)
        self.addObserver(STREAM_END_EVENT,       self.onDisconnected)
        self.addObserver(REQUEST_CONTENT_EVENT,  self.onRequestContent)

    def onConnected(self, xs):
        print 'onConnected(...)'

    def onDisconnected(self, xs):
        print 'onDisconnected(...)'

    def onRequest(self, xs):
        print 'onRequest(...)'

    def onRequestContent(self, xs):
        print 'onRequestContent(...)'

class XMLServerFactory(XmlStreamFactory):
    protocol = XMLServer

endpoint = TCP4ServerEndpoint(reactor, 12345, interface='127.0.0.1')
endpoint.listen(XMLServerFactory())
reactor.run()

client.py output:

TX [127.0.0.1]: <request type='type 01'><content>some request content</content></request>
TX [127.0.0.1]: <request type='type 02'><content>other request content</content></request>

server.py output:

onConnected(...)
onRequest(...)
onDisconnected(...)

My questions:

  1. How do I subscribe to an event that fires when the server encounters a certain XML tag? The //request/content XPath query looks fine to me, but onRequestContent(...) is never called :-(
  2. Is subclassing XmlStream and XmlStreamFactory a reasonable approach at all? It feels odd because XMLServer subscribes to events sent by its own base class and is then passed itself (?) as the xs parameter. Should I make XMLServer an ordinary class that holds an XmlStream object as a member instead? Is there a canonical approach?
  3. How would I add an error handler to the server, like addErrback(...) in the client? I'm worried that exceptions get swallowed (it has happened before), but I don't see where to get a Deferred to attach one to...
  4. Why does the server close the connection after the first request by default? I see XmlStream.onDocumentEnd(...) calling loseConnection(); I could override that method, but I wonder whether there is a reason for the closing that I don't see. Isn't the 'normal' approach to leave the connection open until all the communication needed for the moment has been carried out?

I hope this post isn't considered too specific; talking XML over the network is commonplace, but despite searching for a day and a half, I was unable to find any Twisted XML server examples. Maybe I can turn this into a jump start for anyone with similar questions in the future...

asked Apr 16 '15 19:04 by ssc


2 Answers

This is mostly a guess, but as far as I know you need to open the stream by sending an opening tag without closing it.

In your example, when you send <request type='type 01'><content>some request content</content></request>, the server treats the opening <request> tag as the start of the document, and then treats the </request> you send as the end of the document.

Basically, your server consumes <request> as the document root, which is also why your XPath, //request/content, does not match: all that's left to dispatch as an element is <content>...</content>.

Try sending something like <stream> from the client first, then the two requests and then </stream>.
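To see the difference without any networking, here is a small sketch that uses only the standard library's incremental parser. It is an analogy, not Twisted's own code: the top_level_events helper and its log strings are made up for illustration, and the three kinds of log entries are meant to correspond roughly to XmlStream's onDocumentStart, onElement and onDocumentEnd.

```python
import xml.etree.ElementTree as ET

REQUEST = "<request type='type 01'><content>some request content</content></request>"


def top_level_events(chunks):
    """Feed chunks to an incremental parser and log what happens at depth 0 and 1."""
    parser = ET.XMLPullParser(events=("start", "end"))
    depth = 0
    log = []
    try:
        for chunk in chunks:
            parser.feed(chunk)
            for event, elem in parser.read_events():
                if event == "start":
                    if depth == 0:
                        # The first opening tag becomes the document root.
                        log.append("document start: " + elem.tag)
                    depth += 1
                else:
                    depth -= 1
                    if depth == 0:
                        log.append("document end: " + elem.tag)
                    elif depth == 1:
                        # A complete direct child of the root.
                        log.append("element: " + elem.tag)
    except ET.ParseError:
        # A second root element is not well-formed XML.
        log.append("parse error")
    return log


# Two bare requests: the first one *is* the document, the second is an error.
bare = top_level_events([REQUEST, REQUEST])

# The same requests wrapped in a long-lived root element.
wrapped = top_level_events(["<stream>", REQUEST, REQUEST, "</stream>"])

print(bare)
print(wrapped)
```

In the bare case only <content> shows up as an element and the document ends after the first request. In the wrapped case each <request> arrives as a complete element, which is the thing an XPath observer can match against.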

Also, subclassing XmlStream is fine as long as you make sure you don't accidentally override any of its existing methods.

answered Sep 26 '22 06:09 by Ionut Hulub


The only relevant component of XmlStream here is its SAX-style parser. Here's how I've implemented an asynchronous SAX-style parser using XmlStream and only its XML parsing functions:

server.py

from twisted.words.xish.domish import Element
from twisted.words.xish.xmlstream import XmlStream

class XmlServer(XmlStream):
    def __init__(self):
        XmlStream.__init__(self)    # possibly unnecessary

    def dataReceived(self, data):
        """ Override this method to simply pass the incoming data into the XML parser """
        try:
            self.stream.parse(data)     # self.stream is created when self._initializeStream() is called
        except Exception:
            self._initializeStream()    # reinit the parser so other XML can be parsed

    def onDocumentStart(self, elementRoot):
        """ The root tag has been parsed """
        print('Root tag: {0}'.format(elementRoot.name))
        print('Attributes: {0}'.format(elementRoot.attributes))

    def onElement(self, element):
        """ Children/Body elements parsed """
        print('\nElement tag: {0}'.format(element.name))
        print('Element attributes: {0}'.format(element.attributes))
        print('Element content: {0}'.format(str(element)))

    def onDocumentEnd(self):
        """ Parsing has finished, you should send your response now """
        response = Element(('', 'response'))    # Element is imported directly from domish above
        response['type'] = 'type 01'
        response.addElement('content', content='some response content')
        self.send(response.toXml())

Then you create a Factory class that will produce this Protocol (which you've demonstrated you're capable of). Basically, you get all your information from the XML in the onDocumentStart and onElement methods, and when you've reached the end (i.e. onDocumentEnd) you send a response based on the parsed information. Also, be sure to call self._initializeStream() after parsing each XML message, or else you'll get an exception. That should serve as a good skeleton for you.

My answers to your questions:

  1. Don't know :)
  2. It's very reasonable. However, I usually just subclass XmlStream (which simply inherits from Protocol) and then use a regular Factory object.
  3. This is a good thing to worry about when using Twisted (+1 for you). Using the approach above, you could fire callbacks/errbacks as you parse and hit an element, or wait until you get to the end of the XML and then fire your callbacks to your heart's content. I hope that makes sense :/
  4. I've wondered about this too, actually. I think it has to do with the protocols that use the XmlStream object (such as Jabber/XMPP, where a whole session is a single XML document, so the end of the document means the end of the session). Just override onDocumentEnd and make it do what you want it to do. That's the beauty of OOP.

Reference:

  • xml.sax
  • iterparse
  • twisted.web.sux: Twisted's XML SAX parser. XmlStream's underlying domish parser prefers expat and falls back to sux when expat is unavailable.
  • xml.etree.cElementTree.iterparse: Here's another Stackoverflow question - ElementTree iterparse strategy
  • iterparse is throwing 'no element found: line 1, column 0' and I'm not sure why - I've asked a similar question :)

PS

Your problem is quite common and very simple to solve (at least in my opinion), so don't kill yourself trying to learn the Event Dispatcher model. It actually seems you have a good handle on callbacks and errbacks (i.e. Deferreds), so I suggest you stick to those and avoid the dispatcher.

answered Sep 22 '22 06:09 by notorious.no