The command
$ xmlstarlet sel -t -c "/collection/record" file.xml
seems to load the whole file into memory before applying the given XPath expression. This is unusable for large XML files.
Does xmlstarlet provide a streaming mode to extract subelements from a large (100 GB+) XML file?
Since I only needed a tiny subset of XPath for large XML files, I ended up implementing a small tool myself: xmlcutty.
The example from my question could be written like this:
$ xmlcutty -path /collection/record file.xml
xmlstarlet translates all (or most) operations into XSLT transformations, so the short answer is no.
You could try STX, which is a streaming transformation language similar to XSLT. On the other hand, coding something up in Python using SAX or iterparse may be easier and faster (in terms of development time) if you don't care that much about XML.