Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

HXT: Left-Factoring Nondeterministic Arrows?

I'm trying to come to terms with Haskell's XML Toolbox (HXT) and I'm hitting a wall somewhere, because I don't seem to fully grasp arrows as a computational tool.

Here's my problem, which I hoped to illustrate a little better using a GHCi session:

> let parse p = runLA (xread >>> p) "<root><a>foo</a><b>bar</b><c>baz</c></root>"
> :t parse
parse :: LA XmlTree b -> [b]

So Parse is a small helper function that applies whatever arrow I give it to the trivial XML document

<root>
  <a>foo</a>
  <b>bar</b>
  <c>baz</c>
</root>

I define another helper function, this time to extract the text below a node with a given name:

> let extract s = getChildren >>> isElem >>> hasName s >>> getChildren >>> getText 
> :t extract
extract :: (ArrowXml cat) =>
   String -> cat (Data.Tree.NTree.TypeDefs.NTree XNode) String
> parse (extract "a" &&& extract "b") -- extract two nodes' content.
[("foo","bar")]

With the help of this function, it's easy to use the &&& combinator to pair up the text of two different nodes, and then, say, pass it to a constructor, like this:

> parse (extract "a" &&& extract "b" >>^ arr (\(a,b) -> (b,a))) 
[("bar","foo")]

Now comes the part I don't understand: I want to left-factor! extract calls getChildren on the root-node twice. Instead, I'd like it to only call it once! So I first get the child of the root node

> let extract' s = hasName s >>> getChildren >>> getText
> :t extract'
extract' :: (ArrowXml cat) => String -> cat XmlTree String
> parse (getChildren >>> isElem >>> (extract' "a" &&& extract' "b"))
[]

Note, that I've tried to re-order the calls to, say, isElem, etc. in order to find out if that's the issue. But as it stands, I just don't have any idea why this isn't working. There is an arrow 'tutorial' on the Haskell wiki and the way I understood it, it should be possible to do what I want to do that way — namely use &&& in order to pair up the results of two computations.

It does work, too — but only at the start of the arrow-chain, not mid-way trough, when I have some results already, that I want to keep 'shared.' I have the feeling that I'm just not being able to wrap my head around a difference in ideas between normal function composition and arrow notation. I'd be very appreciative of any pointers! (Even if it is just to some generic arrow-tutorial that goes a little more in-depth than the on the Haskell-wiki.)

Thank you!

like image 427
Aleksandar Dimitrov Avatar asked Nov 18 '10 22:11

Aleksandar Dimitrov


1 Answers

If you convert the arrow to (and then from) a deterministic version this works as expected:

> let extract' s = unlistA >>> hasName s >>> getChildren >>> getText
> parse (listA (getChildren >>> isElem) >>> (extract' "a" &&& extract' "b"))
[("foo","bar")]

This isn't really satisfactory, though, and I can't remember off the top of my head why (&&&) behaves this way with a nondeterministic arrow (I'd personally use the proc/do notation for anything much more complicated than this).


UPDATE: There seems to be something weird going on here with runLA and xread. If you use runX and readString everything works as expected:

> let xml = "<root><a>foo</a><b>bar</b><c>baz</c></root>"
> let parse p = runX (readString [] xml >>> p)
> let extract' s = getChildren >>> hasName s >>> getChildren >>> getText
> parse (getChildren >>> isElem >>> (extract' "a" &&& extract' "b"))
[("foo","bar")]

This means you have to run the parser in the IO monad, but there are advantages to using runX anyway (better error messages, etc.).

like image 168
Travis Brown Avatar answered Sep 27 '22 21:09

Travis Brown