I'm currently starting work on a project where I will have to write PL/SQL code to export large XML files, based on several tables in a database.
The export files could become quite large and could contain up to 700,000 customers (with their addresses, orders, telephone numbers, etc.).
I was wondering if anyone had some tips on the best approach for this. I could obviously just write selects with lots of XMLELEMENTs in them, but that would mean the whole file would be generated in memory.
There's also an XML schema (XSD) to which the files must conform. I was also wondering if there's any way to "map" the tables to the XML schema.
Any tips are appreciated.
XML has some... shortcomings in this area. Large XML files, as you note, can hog RAM and UNDO like there's no tomorrow.
I don't honestly believe that something called "best practices" exists; it all depends on your own database, server and queries. However, here's what a colleague (I can't claim credit) recently did in order to write huge amounts (4.5 GB) of XML to disk from a large number (20?) of large tables (10-400 million rows) with extremely complex sub-queries.
Actually write out all of those XMLElements.
If your SELECT statement is at all complex, create a table first and materialise the results into it.
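A minimal sketch of that staging step (the table and column names here are invented for illustration):
CREATE TABLE customer_export_stage AS
  SELECT c.customer_id,
         c.name,
         a.street,
         a.city
  FROM   customers c
  JOIN   addresses a ON a.customer_id = c.customer_id;

-- An index on the split key helps when fetching records in ID order.
CREATE INDEX customer_export_stage_ix ON customer_export_stage (customer_id);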
Select from the table, taking a reasonable element, hopefully based on your ID. For instance, if you have the following structure, it would make sense to split it on <record>:
<someXML>
  <record ID="1">
    <blah>
      <moreBlah/>
    </blah>
  </record>
  <record ID="2">
    <blah>
      <moreBlah/>
    </blah>
  </record>
</someXML>
Select each record as a CLOB from the database. You then end up with a series of CLOBs that will make up your output XML.
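For example (again with hypothetical table and column names), XMLSERIALIZE can return each record as a CLOB:
SELECT XMLSERIALIZE(
         CONTENT
         XMLELEMENT("record",
           XMLATTRIBUTES(s.customer_id AS "ID"),
           XMLELEMENT("blah",
             XMLELEMENT("moreBlah", s.name)))
         AS CLOB) AS record_xml
FROM   customer_export_stage s
ORDER  BY s.customer_id;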
Write the opening tag first, then write each CLOB to disk individually or in chunks.
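A hedged sketch of that write loop, assuming a directory object (here called XML_DIR) and the record query above. Opening the file in binary mode and using put_raw sidesteps UTL_FILE's 32K-per-line limit:
DECLARE
  v_file  UTL_FILE.file_type;
  v_pos   PLS_INTEGER;
  v_len   PLS_INTEGER;
  -- 8000 characters per chunk keeps each put_raw call safely under the
  -- 32767-byte buffer limit, even with multi-byte characters.
  c_chunk CONSTANT PLS_INTEGER := 8000;
BEGIN
  v_file := UTL_FILE.fopen('XML_DIR', 'export.xml', 'wb', 32767);
  -- Opening root tag first...
  UTL_FILE.put_raw(v_file, UTL_RAW.cast_to_raw('<someXML>' || CHR(10)));
  -- ...then each record CLOB, written in chunks.
  FOR r IN (SELECT XMLSERIALIZE(CONTENT
                     XMLELEMENT("record", XMLATTRIBUTES(s.customer_id AS "ID"),
                       XMLELEMENT("blah", XMLELEMENT("moreBlah", s.name)))
                     AS CLOB) AS record_xml
            FROM   customer_export_stage s
            ORDER  BY s.customer_id) LOOP
    v_len := DBMS_LOB.getlength(r.record_xml);
    v_pos := 1;
    WHILE v_pos <= v_len LOOP
      UTL_FILE.put_raw(v_file,
        UTL_RAW.cast_to_raw(DBMS_LOB.substr(r.record_xml, c_chunk, v_pos)));
      v_pos := v_pos + c_chunk;
    END LOOP;
    UTL_FILE.put_raw(v_file, UTL_RAW.cast_to_raw(CHR(10)));
  END LOOP;
  -- Closing root tag last.
  UTL_FILE.put_raw(v_file, UTL_RAW.cast_to_raw('</someXML>' || CHR(10)));
  UTL_FILE.fclose(v_file);
EXCEPTION
  WHEN OTHERS THEN
    IF UTL_FILE.is_open(v_file) THEN
      UTL_FILE.fclose(v_file);
    END IF;
    RAISE;
END;
/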
Ensure you write to disk locally. If that's not avoidable, write to a network share with a big fat cable pointing at it. You can always move the file afterwards, and that will be more efficient than writing across the network (or a city/country) in chunks.
Parallelize! This isn't always possible, but if you can do it, then do so.
Be careful when parallelizing, though; you don't want to end up writing malformed XML.
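One hedged way to carve up the work (assuming a numeric customer_id): compute disjoint ID ranges, let each session export its own fragment file for one range, and concatenate the fragments between the opening and closing root tags afterwards, which keeps the final document well formed:
SELECT bucket,
       MIN(customer_id) AS low_id,
       MAX(customer_id) AS high_id
FROM  (SELECT customer_id,
              NTILE(8) OVER (ORDER BY customer_id) AS bucket
       FROM   customer_export_stage)
GROUP  BY bucket
ORDER  BY bucket;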
I'm effectively advocating tbone's approach, save for doing it in chunks instead. Whatever you do, avoid putting the entire thing in memory.
Try using DBMS_XMLGEN first. There are other approaches as well; see this Oracle XML DB doc.
DECLARE
  v_ctx  DBMS_XMLGEN.ctxhandle;
  v_file UTL_FILE.file_type;
  v_xml  CLOB;
  v_more BOOLEAN := TRUE;
BEGIN
  -- Create XML context.
  v_ctx := DBMS_XMLGEN.newcontext('SELECT table_name, tablespace_name FROM user_tables WHERE rownum < 6');

  -- Set parameters to alter the default rowset and row tag names and the default case.
  DBMS_XMLGEN.setrowsettag(v_ctx, 'USER_TABLES');
  DBMS_XMLGEN.setrowtag(v_ctx, 'TABLE');
  --DBMS_XMLGEN.settagcase(v_ctx, DBMS_XMLGen.LOWER_CASE);

  -- Add an IE-specific XSL stylesheet reference so a browser can transform the file.
  --DBMS_XMLGEN.setstylesheetheader(v_ctx, 'C:\Development\XML\IEStyle.xsl', 'text/xsl');

  -- Create the XML document.
  v_xml := DBMS_XMLGEN.getxml(v_ctx);
  DBMS_XMLGEN.closecontext(v_ctx);

  -- Output the XML document to a file. On recent Oracle versions the first
  -- argument must be a directory object name rather than an OS path. Note
  -- the explicit 32767 max_linesize; with the default of 1024 the 32K
  -- writes below would fail.
  v_file := UTL_FILE.fopen('C:\Development\XML\', 'test1.xml', 'w', 32767);
  WHILE v_more LOOP
    UTL_FILE.put(v_file, SUBSTR(v_xml, 1, 32767));
    IF LENGTH(v_xml) > 32767 THEN
      v_xml := SUBSTR(v_xml, 32768);
    ELSE
      v_more := FALSE;
    END IF;
  END LOOP;
  UTL_FILE.fclose(v_file);

  -- Test insert into a table.
  /*
  insert into t_clob (clob_col) values (v_xml);
  commit;
  */
EXCEPTION
  WHEN OTHERS THEN
    DBMS_OUTPUT.put_line(SUBSTR(SQLERRM, 1, 255));
    -- Only close the handle if fopen actually succeeded.
    IF UTL_FILE.is_open(v_file) THEN
      UTL_FILE.fclose(v_file);
    END IF;
END;
Note that I borrowed most of this from the excellent oracle-base site.
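If the full document is too big for a single getxml call, DBMS_XMLGEN can also page through the cursor: setmaxrows limits how many rows each getxml call returns, and getnumrowsprocessed tells you when the cursor is exhausted. A rough sketch (the batch size and query are placeholders):
DECLARE
  v_ctx DBMS_XMLGEN.ctxhandle;
  v_xml CLOB;
BEGIN
  v_ctx := DBMS_XMLGEN.newcontext('SELECT customer_id, name FROM customer_export_stage');
  DBMS_XMLGEN.setmaxrows(v_ctx, 10000);  -- rows per batch
  LOOP
    v_xml := DBMS_XMLGEN.getxml(v_ctx);
    EXIT WHEN DBMS_XMLGEN.getnumrowsprocessed(v_ctx) = 0;
    -- Write this batch to the file in chunks, as above. Note that each
    -- batch is a complete document with its own XML prolog and rowset
    -- wrapper, so you would strip those from all but the first batch.
    NULL;
  END LOOP;
  DBMS_XMLGEN.closecontext(v_ctx);
END;
/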