Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Generating large xml files from oracle: Best practices

Tags:

oracle

plsql

I'm currently starting work on a project where I will have to write code (pl/sql) to export large XML files, based upon several tables in a database.

The export files could become quite large and could contain up to 700.000 customers (with their addresses, orders, telephone numbers, etc..).

I was wondering if anyone had some tips on the best approach for this. I could obviously just write out selects with lost of XMLELEMENTS in them, but that would mean the whole file would be generated in memory.

There's also a XML schema (XSD) available to which the files have to comply. I was also wondering if there's any way to "map" the table to the XML schema.

Any tips are appreciated.

like image 767
ErikL Avatar asked Jun 03 '13 13:06

ErikL


Video Answer


2 Answers

XML has some... shortcomings in this area. Large XML files, as you note, can hog RAM and UNDO like there's no tomorrow.

I don't honestly believe that something called "best practices" exists, it all depends on your own database, server and queries. However, here's what a colleague's (I can't claim credit) just done in order to write huge amounts (4.5GB) of XML to disk from a large number (20?) of large tables (10-400m rows) with extremely complex sub-queries.

  • Actually write out all of those XMLElements

  • If your SELECT statement is at all complex create a table first.

  • Select from the table, taking a reasonable element, hopefully based on your ID. For instance if you have the following structure it would make sense to split it on <record>

    <someXML>
        <record ID="1">
            <blah>
                <moreBlah/>
            </blah>
        </record>
        <record ID="2">
            <blah>
                <moreBlah/>
            </blah>
        </record>
    </someXML>
    
  • Select each record as a CLOB from the database. You then end up with a series of CLOBs that will make up your output XML.

  • Write the opening tag first then individually, or in chunks, write each CLOB to disk

  • Ensure you write to disk locally. If it's not avoidable write to a network share where there's a have a big fat cable pointing at it. You can always move your file afterwards and this'll be more efficient than writing across the network (or a city/country) in chunks.

  • Parallelize! This isn't always possible but if you can do it then do so.

  • Be careful of parallelizing. You don't want to be writing malformed XML.

I'm effectively advocating tbone's approach, save doing it in chunks instead. Whatever you do avoid putting the entire thing in memory.

like image 168
Ben Avatar answered Oct 05 '22 15:10

Ben


Try using DBMS_XMLGEN first. There are other approaches as well, see this Oracle XML DB doc

DECLARE
  v_ctx   DBMS_XMLGEN.ctxhandle;
  v_file  UTL_FILE.file_type;
  v_xml   CLOB;
  v_more  BOOLEAN := TRUE;
BEGIN
  -- Create XML context.
  v_ctx := DBMS_XMLGEN.newcontext('SELECT table_name, tablespace_name FROM user_tables WHERE rownum < 6');

  -- Set parameters to alter default Rowset and Row tag names and default case.
  DBMS_XMLGEN.setrowsettag(v_ctx, 'USER_TABLES'); 
  DBMS_XMLGEN.setrowtag(v_ctx, 'TABLE'); 
  --DBMS_XMLGEN.settagcase(v_ctx, DBMS_XMLGen.LOWER_CASE);

  -- Add an IE specfic XSL stylesheet reference so browser can transform the file.
  --DBMS_XMLGEN.setstylesheetheader(v_ctx, 'C:\Development\XML\IEStyle.xsl', 'text/xsl');

  -- Create the XML document.
  v_xml := DBMS_XMLGEN.getxml(v_ctx);
  DBMS_XMLGEN.closecontext(v_ctx);

  -- Output XML document to file.
  v_file := UTL_FILE.fopen('C:\Development\XML\', 'test1.xml', 'w');
  WHILE v_more LOOP
    UTL_FILE.put(v_file, SUBSTR(v_xml, 1, 32767));
    IF LENGTH(v_xml) > 32767 THEN
      v_xml :=  SUBSTR(v_xml, 32768);
    ELSE
      v_more := FALSE;
    END IF;
  END LOOP;
  UTL_FILE.fclose(v_file);

  -- test insert into table
  /*
  insert into t_clob (clob_col) values (v_xml);
  commit;
  */
EXCEPTION
  WHEN OTHERS THEN
    DBMS_OUTPUT.put_line(Substr(SQLERRM,1,255));
    UTL_FILE.fclose(v_file);
END;

Note that I borrowed most of this from the excellent oracle-base site

like image 25
tbone Avatar answered Oct 05 '22 15:10

tbone