Asynchronous XmlReader in .NET?

Tags: c#, xml, xmpp

Is there a way to access an XmlReader asynchronously? The XML comes in off the network from many different clients, as in XMPP; it is a constant stream of <action>...</action> tags.

What I'm after is a BeginRead/EndRead-like interface. The best solution I've managed to come up with is to do an asynchronous read for 0 bytes on the underlying network stream and then, when some data arrives, call Read on the XmlReader. This will, however, block until all of the data for the node becomes available. That solution looks roughly like this:

// requires: using System; using System.IO; using System.Net.Sockets; using System.Xml;
private Stream syncstream;
private NetworkStream ns; // assumed to be connected before Init() is called
private XmlReader reader;

// this code runs first
public void Init()
{
    syncstream = Stream.Synchronized(ns);
    reader = XmlReader.Create(syncstream);
    byte[] x = new byte[1];
    // zero-byte read: the callback fires once data is available, without consuming any of it
    syncstream.BeginRead(x, 0, 0, new AsyncCallback(ReadCallback), null);
}

private void ReadCallback(IAsyncResult ar)
{
    syncstream.EndRead(ar);
    reader.Read(); // this will block for a while, until the entire node is available
    // do something with the XML node
    byte[] x = new byte[1];
    // re-arm the zero-byte read for the next node
    syncstream.BeginRead(x, 0, 0, new AsyncCallback(ReadCallback), null);
}

EDIT: Would this be a workable algorithm for working out whether a string contains a complete XML node?

Func<string, bool> nodeChecker = currentBuffer =>
{
    // if there is nothing, definitely no tag
    if (currentBuffer == "") return false;
    // if we have <![CDATA[ and not ]]>, hold on, else pass it on
    if (currentBuffer.Contains("<![CDATA[") && !currentBuffer.Contains("]]>")) return false;
    if (currentBuffer.Contains("<![CDATA[") && currentBuffer.Contains("]]>")) return true;
    // these tag-related checks will also catch <? ?> processing instructions
    // if there is a < but no >, we still have an open tag
    if (currentBuffer.Contains("<") && !currentBuffer.Contains(">")) return false;
    // if there is a <...>, we have a complete tag
    // >...< will never happen because we pass the buffer on to the parser when we get to >
    if (currentBuffer.Contains("<") && currentBuffer.Contains(">")) return true;
    // if there is no < and no >, we have a complete text node
    if (!currentBuffer.Contains("<") && !currentBuffer.Contains(">")) return true;
    // > and no < will never happen, we pass the buffer on to the parser when we get to >
    // by default, don't pass anything on
    return false;
};
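
To make the intended use of the check above concrete, here is a minimal sketch that applies it to a buffer as chunks arrive. The class name NodeCheckDemo, the method name LooksComplete, and the sample chunks are hypothetical and only for illustration; the logic is the same heuristic restated as a method, and StringBuilder.Clear assumes .NET 4 or later.

```csharp
using System;
using System.Text;

public static class NodeCheckDemo
{
    // Same heuristic as the lambda above, restated as a method (hypothetical name).
    public static bool LooksComplete(string buffer)
    {
        if (buffer.Length == 0) return false;
        // An opened CDATA section is complete only once its terminator has arrived.
        if (buffer.Contains("<![CDATA[")) return buffer.Contains("]]>");
        bool hasOpen = buffer.Contains("<");
        bool hasClose = buffer.Contains(">");
        if (hasOpen && !hasClose) return false; // tag still open
        if (hasOpen && hasClose) return true;   // a complete tag
        if (!hasOpen && !hasClose) return true; // plain text node
        return false;                           // ">" without "<": don't pass on
    }

    public static void Main()
    {
        var buffer = new StringBuilder();
        // Hypothetical network chunks; the first one ends in the middle of a tag.
        foreach (string chunk in new[] { "<act", "ion>", "hello", "</action>" })
        {
            buffer.Append(chunk);
            if (LooksComplete(buffer.ToString()))
            {
                Console.WriteLine("pass to parser: " + buffer);
                buffer.Clear(); // start collecting the next node
            }
        }
    }
}
```

Note that a check like this only looks at which characters are present, so a ">" inside an attribute value or a chunk boundary inside a text node would still be passed on early; it is a heuristic rather than a parser.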
KJ Tsanaktsidis asked Feb 15 '10 02:02



2 Answers

If I remember correctly from when I looked into this a couple of years ago, XmlReader buffers in 4 kB chunks. You could pad your inbound data to 4 kB (ick!) or use a better parser. I fixed this by porting James Clark's XP (Java) to C# as part of Jabber-Net, here:

http://code.google.com/p/jabber-net/source/browse/#svn/trunk/xpnet

It's LGPL, only handles UTF-8, isn't packaged for use, and has almost no documentation, so I wouldn't recommend using it. :)

Joe Hildebrand answered Nov 02 '22 21:11



The easiest thing to do is just to run the blocking read on another thread, perhaps a ThreadPool thread, depending on how long it stays active. (Don't use thread pool threads for truly long-running tasks.)
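
A minimal sketch of that idea, under stated assumptions: the input is a sequence of top-level <action> elements (hence ConformanceLevel.Fragment), .NET 4 or later is available for BlockingCollection, and the names BackgroundXmlReaderDemo and StartReader are hypothetical. A MemoryStream stands in for the network stream so that the example is self-contained.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Text;
using System.Threading;
using System.Xml;

public static class BackgroundXmlReaderDemo
{
    // Runs the blocking XmlReader loop on a dedicated thread and hands the
    // outer XML of each top-level element to `output` (hypothetical helper).
    public static Thread StartReader(Stream input, BlockingCollection<string> output)
    {
        var thread = new Thread(() =>
        {
            try
            {
                // Fragment mode permits several top-level elements in one stream.
                var settings = new XmlReaderSettings { ConformanceLevel = ConformanceLevel.Fragment };
                using (XmlReader reader = XmlReader.Create(input, settings))
                {
                    reader.Read(); // blocks this thread only, not the caller
                    while (!reader.EOF)
                    {
                        if (reader.NodeType == XmlNodeType.Element)
                            output.Add(reader.ReadOuterXml()); // also advances past the element
                        else
                            reader.Read();
                    }
                }
            }
            finally
            {
                // Lets consumers finish their loop; a real service would also
                // catch and report XmlException instead of letting it escape.
                output.CompleteAdding();
            }
        });
        thread.IsBackground = true; // don't keep the process alive for this thread
        thread.Start();
        return thread;
    }

    public static void Main()
    {
        byte[] bytes = Encoding.UTF8.GetBytes("<action>one</action><action>two</action>");
        var queue = new BlockingCollection<string>();
        StartReader(new MemoryStream(bytes), queue);
        foreach (string xml in queue.GetConsumingEnumerable())
            Console.WriteLine(xml);
    }
}
```

A dedicated Thread is used here rather than a thread pool thread because the loop lives as long as the connection does. With many clients this means one thread per connection, which is the main cost of the approach.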

kyoryu answered Nov 02 '22 22:11
