
Haskell HXT: Parsing xml documents with remote DTD without hxt-curl

Tags: haskell, hxt

I'm trying to parse the following XML document with HXT:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Key</key>
    <string>Value</string>
</dict>
</plist>

I don't want any validation here, since it would require network access. Unfortunately, HXT still wants the hxt-curl or hxt-http package installed to parse this simple document:

Prelude> :m +Text.XML.HXT.Core
Prelude Text.XML.HXT.Core> runX $ readDocument [withValidate no] "example.xml"

fatal error: HTTP handler not configured,
please install package hxt-curl and use 'withCurl' config option
or install package hxt-http and use 'withHTTP' config option

I don't want to add hxt-curl or hxt-http to the list of dependencies, since I don't really need them. I can't change the documents I'm parsing, and moving to another XML parsing library is also undesirable.

Is there a way to parse the sample document with HXT without adding unnecessary packages?

roman-kashitsyn asked Apr 04 '14 07:04



1 Answer

You also have to set withSubstDTDEntities no, i.e.

runX $ readDocument [withValidate no, withSubstDTDEntities no] "example.xml"

Explanation: this option defaults to yes, and I guess that's why HXT tries to download the DTD file. From the documentation:

Switching this option and the validation off can lead to faster parsing, in that case reading the DTD documents is not longer necessary.
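
For completeness, here is a minimal sketch of a standalone program that uses both options. The function name plistKeys and the command-line handling are illustrative assumptions, not part of the question; the only dependency assumed is the hxt package itself. It reads the file given as an argument and prints the text of every <key> element:

```haskell
module Main where

import System.Environment (getArgs)
import Text.XML.HXT.Core

-- | Hypothetical helper: read a plist file with validation and DTD
-- entity substitution switched off, and return the text inside
-- every <key> element.
plistKeys :: FilePath -> IO [String]
plistKeys path =
  runX $
    readDocument
      [ withValidate no          -- do not validate against the DTD
      , withSubstDTDEntities no  -- do not substitute entities defined in the DTD
      ]
      path
      >>> deep (isElem >>> hasName "key")  -- find <key> elements at any depth
      >>> getChildren                      -- descend to their child nodes
      >>> getText                          -- keep only the text nodes

main :: IO ()
main = do
  args <- getArgs
  case args of
    [path] -> plistKeys path >>= mapM_ putStrLn
    _      -> putStrLn "usage: plist-keys FILE"
```

Run against the example.xml from the question, this should print Key, assuming the two options behave as described above and no HTTP handler is requested.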

Stephan Kulla answered Sep 22 '22 23:09
