
Oracle: loading a large xml file?

Tags: sql, xml, oracle

So now that I have a large bit of XML data I'm interested in:

https://blog.stackoverflow.com/2009/06/stack-overflow-creative-commons-data-dump

I'd like to load this into Oracle to play with.

How can I load a large XML file directly into Oracle? Both server-side solutions (where the data file can be opened on the server) and client-side solutions are welcome.

Here's a bit of badges.xml for a concrete example.

<?xml version="1.0" encoding="UTF-8" ?>
  <badges>
  <row UserId="3718" Name="Teacher" Date="2008-09-15T08:55:03.923"/>
  <row UserId="994" Name="Teacher" Date="2008-09-15T08:55:03.957"/>
  ...
Mark Harrison asked Jun 15 '09 19:06



2 Answers

You can access XML files on the server via SQL. With your data in /tmp/tmp.xml, you would first declare the directory:

SQL> create directory d as '/tmp';

Directory created

You could then query your XML file directly:

SQL> SELECT XMLTYPE(bfilename('D', 'tmp.xml'), nls_charset_id('UTF8')) xml_data
  2    FROM dual;

XML_DATA
--------------------------------------------------------------------------------
<?xml version="1.0" encoding="UTF-8"?>
<badges>
  [...]

To access the fields in your file, you could use the method described in another SO question, for example:

SQL> SELECT UserId, Name, to_timestamp(dt, 'YYYY-MM-DD"T"HH24:MI:SS.FF3') dt
  2    FROM (SELECT XMLTYPE(bfilename('D', 'tmp.xml'), 
                            nls_charset_id('UTF8')) xml_data
  3            FROM dual),
  4         XMLTable('for $i in /badges/row
  5                              return $i'
  6                  passing xml_data
  7                  columns UserId NUMBER path '@UserId',
  8                          Name VARCHAR2(50) path '@Name',
  9                          dt VARCHAR2(25) path '@Date');

    USERID NAME       DT                         
---------- ---------- ---------------------------
      3718 Teacher    2008-09-15 08:55:03.923    
       994 Teacher    2008-09-15 08:55:03.957    
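
If the goal is to keep the data around to play with, the same query can feed a regular table. This is only a minimal sketch, assuming the directory D and the file tmp.xml from above. The table name badges_rel is made up for illustration, and the XQuery is shortened to the equivalent path expression '/badges/row'.

```sql
-- badges_rel is a hypothetical table name; adjust column sizes to your data.
CREATE TABLE badges_rel AS
SELECT x.UserId,
       x.Name,
       -- the Date attribute arrives as text, so convert it explicitly
       to_timestamp(x.dt, 'YYYY-MM-DD"T"HH24:MI:SS.FF3') AS dt
  FROM XMLTable('/badges/row'
                PASSING XMLTYPE(bfilename('D', 'tmp.xml'),
                                nls_charset_id('UTF8'))
                COLUMNS UserId NUMBER       PATH '@UserId',
                        Name   VARCHAR2(50) PATH '@Name',
                        dt     VARCHAR2(25) PATH '@Date') x;
```

After that, badges_rel can be indexed and queried like any other table without touching the file again.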
Vincent Malgrat answered Oct 15 '22 12:10


Seems like you're talking about two issues: first, getting the XML document to where Oracle can see it, and then maybe making it so that standard relational tools can be applied to the data.

For the first, you or your DBA can create a table with a BLOB, CLOB, or BFILE column and load the data. If you have access to the server on which the database lives, you can define a DIRECTORY object in the database that points to an operating system directory and put your file there. Then either set it up as a BFILE or read it in. (CLOB and BLOB store the data in the database; BFILE stores a pointer to a file on the operating system side.)
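
As a rough sketch of that step, the script below keeps both a BFILE pointer and a CLOB copy of the file. It assumes a DIRECTORY object named D pointing at /tmp (as created in the other answer) and a file tmp.xml in it. The table name xml_docs and its columns are made up for illustration, and the character set is assumed to match the file's encoding.

```sql
-- xml_docs is a hypothetical table: one row per loaded file.
CREATE TABLE xml_docs (
  file_name VARCHAR2(255) PRIMARY KEY,
  src       BFILE,   -- pointer to the file on the server's file system
  doc       CLOB     -- copy of the content stored inside the database
);

INSERT INTO xml_docs (file_name, src, doc)
VALUES ('tmp.xml', bfilename('D', 'tmp.xml'), empty_clob());

DECLARE
  l_src         BFILE;
  l_doc         CLOB;
  l_dest_offset INTEGER := 1;
  l_src_offset  INTEGER := 1;
  l_lang_ctx    INTEGER := dbms_lob.default_lang_ctx;
  l_warning     INTEGER;
BEGIN
  -- Lock the row so the CLOB locator can be written to.
  SELECT src, doc
    INTO l_src, l_doc
    FROM xml_docs
   WHERE file_name = 'tmp.xml'
     FOR UPDATE;

  dbms_lob.fileopen(l_src, dbms_lob.file_readonly);
  -- Copy the whole file into the CLOB, converting from the file's character set.
  dbms_lob.loadclobfromfile(
    dest_lob     => l_doc,
    src_bfile    => l_src,
    amount       => dbms_lob.lobmaxsize,
    dest_offset  => l_dest_offset,
    src_offset   => l_src_offset,
    bfile_csid   => nls_charset_id('UTF8'),
    lang_context => l_lang_ctx,
    warning      => l_warning);
  dbms_lob.fileclose(l_src);
  COMMIT;
END;
/
```

Keeping only the BFILE column avoids copying the data but leaves the database dependent on the file staying in place; the CLOB copy trades storage for independence from the file system.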

Alternatively, use some tool that will let you write CLOBs directly to the database. Either way, that gets you to the point where you can see the XML instance document in the database.

So now you have the instance document visible. Step 1 is done.

Depending on the version, Oracle has some pretty good tools for shredding the XML into relational tables.

It can be pretty declarative. While this gets beyond what I've actually done (I have a project where I'll be trying it this fall), you can theoretically load your XML Schema into the database and annotate it with the crosswalk between the relational tables and the XML. Then take your CLOB or BFILE and convert it to an XMLTYPE column with the defined schema and you're done -- the shredding happens automatically, the data is all there, it's all relational, and it's all available to standard SQL without the XQuery or XML extensions.

Of course, if you'd rather use XQuery, then just take the CLOB or BFILE, convert it to an XMLTYPE, and go for it.
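
A small sketch of the XQuery route, assuming the directory D and file tmp.xml from the other answer and the badges layout shown in the question. The XQuery expression and the column alias are only illustrative.

```sql
-- Count the rows whose Name attribute is "Teacher" with XQuery.
SELECT XMLCast(
         XMLQuery('count(/badges/row[@Name = "Teacher"])'
                  PASSING XMLTYPE(bfilename('D', 'tmp.xml'),
                                  nls_charset_id('UTF8'))
                  RETURNING CONTENT)
         AS NUMBER) AS teacher_badges
  FROM dual;
```

If the document lives in a CLOB column instead, XMLTYPE(clob_column) can take the place of the BFILE-based constructor in the PASSING clause.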

Jim Hudson answered Oct 15 '22 13:10