Bash shell script to split a big XML file into multiple small files

I have an XML file in the format below:

<?xml version="1.0" encoding="utf-8" ?>
<parent>
    <child>
        <code></code>
        <text></text>
    </child>
    <child>
        <code></code>
        <text></text>
    </child>
</parent>

I need a Bash shell script to split this main XML file into multiple small XML files, each containing the content from a <child> tag to its matching </child> tag. File names could be the parent file name plus a running serial number such as _1, for example 20110721_1.xml. Please help me with the script.

Balaji asked Aug 18 '11 11:08


2 Answers

This is not a complete answer, but you can tune it yourself:

csplit -ksf part. src.xml /\<child\>/ "{100}" 2>/dev/null

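This command splits src.xml at every line that matches the regexp <child> (the shell strips the backslashes from /\<child\>/ before csplit sees it) and writes the pieces to files named part.00, part.01, and so on. The "{100}" argument repeats the pattern up to 100 times; -k keeps the files already written when the input has fewer matches than that, -s suppresses the size report, -f sets the output prefix, and 2>/dev/null hides the resulting "match not found" message. You still need to play with the regexp, though: the first piece holds the XML declaration and the <parent> line rather than a child, the closing </parent> tag ends up in the last piece, and none of the pieces is a standalone XML document.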
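
If the pieces need to be standalone files named after the source file, a line-based awk pass is one way to tune this further. The following is only a minimal sketch, assuming that every <child> and </child> tag sits on its own line as in the sample from the question, and that prepending an XML declaration to each output file is wanted. The function name split_children is made up for illustration. It treats the file as plain text and is not a real XML parser.

```shell
#!/usr/bin/env bash
# Sketch: write each <child>...</child> block of an XML file to its own file.
# Assumes <child> and </child> each sit on their own line, as in the sample.
# split_children is a hypothetical helper name, not an existing tool.
split_children() {
    local src=$1
    local base=${src%.xml}          # 20110721.xml -> 20110721
    awk -v base="$base" '
        /<child>/ {                 # opening tag: start a new output file
            n++
            out = base "_" n ".xml"
            print "<?xml version=\"1.0\" encoding=\"utf-8\" ?>" > out
            inside = 1
        }
        inside { print > out }      # copy lines while inside a child block
        /<\/child>/ {               # closing tag: finish the current file
            inside = 0
            close(out)
        }
    ' "$src"
}
```

Calling split_children 20110721.xml would then write 20110721_1.xml, 20110721_2.xml, and so on next to the source file. Input where several tags share one line, or where <child> carries attributes, would need a different pattern or a proper XML tool.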

NilColor answered Oct 16 '22 16:10


One solution is to write an XSL stylesheet and run xsltproc with the stylesheet and the XML file to generate the individual files.

See How to split XML file into many XML files using XSLT for an example.
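
As a rough illustration of this approach (not the stylesheet from the linked answer), the sketch below wraps xsltproc in a Bash function. It assumes xsltproc from libxslt is installed and relies on the EXSLT exsl:document extension element, which libxslt implements, to write one file per child element. The function name split_with_xslt and the stylesheet parameter prefix are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: split /parent/child elements into separate files with xsltproc.
# Assumes xsltproc (libxslt) with EXSLT exsl:document support is installed.
# split_with_xslt and the "prefix" parameter are hypothetical names.
split_with_xslt() {
    local src=$1
    local prefix=${src%.xml}        # 20110721.xml -> 20110721
    local xsl
    xsl=$(mktemp) || return 1

    # Stylesheet: for every child element, open a new output document
    # named <prefix>_<position>.xml and copy the element into it.
    cat > "$xsl" <<'EOF'
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    extension-element-prefixes="exsl">
  <xsl:output method="text"/>
  <xsl:param name="prefix" select="'out'"/>
  <xsl:template match="/">
    <xsl:for-each select="/parent/child">
      <exsl:document href="{$prefix}_{position()}.xml"
                     method="xml" encoding="utf-8">
        <xsl:copy-of select="."/>
      </exsl:document>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
EOF

    xsltproc --stringparam prefix "$prefix" "$xsl" "$src"
    local status=$?
    rm -f "$xsl"
    return "$status"
}
```

Calling split_with_xslt 20110721.xml should produce 20110721_1.xml, 20110721_2.xml, and so on. Unlike a line-based split, this does not depend on how the tags are laid out across lines, because xsltproc parses the input as XML.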

cweiske answered Oct 16 '22 15:10