Merge multiple XML files from command line

I have several XML files. They all have the same structure, but were split because of file size. So, let's say I have A.xml, B.xml, C.xml and D.xml and want to merge them into combined.xml using a command-line tool.

A.xml

<products>
    <product id="1234"></product>
    ...
</products>

B.xml

<products>
  <product id="5678"></product>
  ...
</products>

etc.

TutanRamon asked Jan 25 '12 14:01


1 Answer

High-tech answer:

Save this Python script as xmlcombine.py:

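#!/usr/bin/env python
import sys
from xml.etree import ElementTree

def run(files):
    # The root element of the first file becomes the container for the rest.
    first = None
    for filename in files:
        data = ElementTree.parse(filename).getroot()
        if first is None:
            first = data
        else:
            # Append the children of each later root to the first root.
            first.extend(data)
    if first is not None:
        # encoding="unicode" makes tostring() return str, so print() works on Python 3.
        print(ElementTree.tostring(first, encoding="unicode"))

if __name__ == "__main__":
    run(sys.argv[1:])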

To combine files, run:

python xmlcombine.py ?.xml > combined.xml
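
To see what the merge does to the sample data from the question, here is a minimal sketch that applies the same approach to in-memory strings instead of files. The helper name combine_strings is made up for this illustration and is not part of the script above.

```python
from xml.etree import ElementTree

def combine_strings(documents):
    """Merge XML documents given as strings.

    The first document's root is kept; the children of every later
    root are appended to it. Returns None for an empty input list.
    """
    first = None
    for text in documents:
        root = ElementTree.fromstring(text)
        if first is None:
            first = root
        else:
            # list(root) takes a snapshot of the children before moving them.
            first.extend(list(root))
    return first

a = '<products><product id="1234"></product></products>'
b = '<products><product id="5678"></product></products>'

merged = combine_strings([a, b])
print([p.get("id") for p in merged.findall("product")])
# prints ['1234', '5678']
```

Note that only the first file's root element (and its attributes) survives; the roots of the other files are discarded and just their children are kept.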

For further enhancement, consider using:

  • chmod +x xmlcombine.py: Makes the script executable, so you can omit python on the command line (invoke it as ./xmlcombine.py unless its directory is on your PATH)

  • xmlcombine.py !(combined).xml > combined.xml: Collects all XML files except the output, but requires bash's extglob option (shopt -s extglob)

  • xmlcombine.py *.xml | sponge combined.xml: Also merges the contents of an existing combined.xml, but requires the sponge program (part of moreutils)

  • import lxml.etree as ElementTree: Uses a potentially faster XML parser
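
If you would rather not depend on extglob or sponge, the exclusion of the output file and the optional lxml import can also be handled inside Python. The following is only a sketch under that assumption: the function name combine_to_file and its two parameters are hypothetical, and it writes the result to a file itself instead of printing to standard output.

```python
import glob
import os

try:
    import lxml.etree as ElementTree  # optional third-party parser
except ImportError:
    from xml.etree import ElementTree  # standard-library fallback

def combine_to_file(pattern, output):
    """Merge every file matching pattern, except output itself, into output."""
    skip = os.path.abspath(output)
    first = None
    for filename in sorted(glob.glob(pattern)):
        if os.path.abspath(filename) == skip:
            continue  # never feed the output file back into the merge
        root = ElementTree.parse(filename).getroot()
        if first is None:
            first = root
        else:
            # Move the children of this root under the first root.
            first.extend(list(root))
    if first is not None:
        # All inputs are parsed before the output file is opened for writing.
        ElementTree.ElementTree(first).write(
            output, encoding="utf-8", xml_declaration=True
        )
    return first
```

Calling combine_to_file("*.xml", "combined.xml") in the directory that holds A.xml through D.xml would then play the role of the shell redirection. Files are merged in sorted name order.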

eswald answered Sep 28 '22 01:09