In our business, we receive and need to process thousands of XML files per day and all these files are in the same format. We would like to store these data into oracle tables and reserve the hierarchical relationship of these data and then we can query them using traditional SQL and do further analysis. What is the best way to do that? Is XML DB the right choice?
Currently, I am thinking using XML DB with data stored in structured storage. So I understand that I can define a XML Schema with annotations and I know the steps involved but I would like to get confirmed answers for the following questions:
The need for saving XML files to a database is quite common since XML data can be found everywhere; and DataDirect XQuery has an easy and highly scalable solution for doing just that.
Oracle XML DB provides full support for all of the key XML standards, including XML, Namespaces, DOM, XQuery, SQL/XML and XSLT. By providing full support for XML standards, Oracle XML DB supports native XML application development.
In SQL Server, you usually store XML data in a column configured with the xml data type. The data type supports several methods that let you query and modify individual elements, attributes, and their values directly within the XML instance, rather than having to work with that instance as a whole.
You definitely want to begin with XMLDB.
XMLDB is a whole world of features and functionality within itself.
Very briefly, you have three storage options w/ XMLDB and XMLTYPE data.
You can store:
1.) Into a CLOB datatype. If you do this, the XML just sits in the database, and it's a LOB. You can't index it, search it, etc. The database is a bit bucket, and you store the XML.
2.) BINARY XML: This is the newest option, introduced in 11gR2. This will tokenize and compress the XML, and store in in an encoded binary format. The advantage here, is that you can then use Oracle Text and XMLINDEX domain indexes, to index and search the content. This is the option I have the most experience with. We have a billion documents, average size around 12k, stored in an 11gR2 database. It's working really well for us.
3.) XML to relational mapping: You define a relational schema to store the data in your XML schema, and define how the XML elements map to which relational tables and fields. On ingestion, the XML is decomposed and stored into a conventional relational model. You can then search, sort, index, as you would any traditional database.
It really depends on what you want to do, as to which XML storage model you choose to go with.
You may try asking your questions on the Oracle XMLDB forum. It's pretty active, and has some folks that are a lot more knowledgeable on the subject than I am. https://forums.oracle.com/forums/forum.jspa?forumID=34
Hope that helps.
Reply to Update 1:
I really only have experience with Binary XML storage option, sorry. I think that checking out the XMLDB Sample Code on OTN, would be useful.
Please see:
http://www.oracle.com/technetwork/indexes/samplecode/xmldb-sample-523617.html
There is a good book, I used it and it helped me through all the fundamental questions. It is about Oracle 11g, but it is also usable with 12c.
It is called "Building Oracle XML DB Applications", I would suggest, buy it and reserve two days for reading and trying.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With