We've got an Oracle 11g installation that is starting to get big. This database is the backend to a parallel optimization system running on a cluster. Input to the process is contained in the database along with output from the optimization steps. The input includes rote configuration data and some binary files (using 11g's SecureFiles). The output includes 1D, 2D, 3D, and 4D data currently stored in the DB.
DB Structure:
/* Metadata tables */
Case(CaseId, DeleteFlag, ...) On Delete Cascade CaseId
OptimizationRun(OptId, CaseId, ...) On Delete Cascade OptId
OptimizationStep(StepId, OptId, ...) On Delete Cascade StepId
/* Data tables */
Files(FileId, CaseId, Blob) /* deletes are near instantateous here */
/* Data per run */
OnedDataX(OptId, ...)
TwoDDataY1(OptId, ...) /* packed representation of a 1D slice */
/* Data not only per run, but per step */
TwoDDataY2(StepId, ...)  /* packed representation of a 1D slice */
ThreeDDataZ(StepId, ...) /* packed representation of a 2D slice */
FourDDataZ(StepId, ...)  /* packed representation of a 3D slice */
/* ... About 10 or so of these tables exist */
A reaper script comes around daily and looks for cases with the DeleteFlag = 1 and proceeds with the DELETE FROM Case WHERE DeleteFlag = 1, allowing the cascades to continue.
This strategy works great for read/write, but is now outstripping our capabilities when we want to purge data! The rub is deleting a Case takes ~20-40 minutes depending on the size and often overloads our archiver space. The next major version of the product will take a "from the ground up" approach to solving the problem. The next minor release needs to stay within the confines of data stored in the database.
So, for the minor release we need an approach that can improve delete performance and at most require moderate changes to the database.
Case and REF on the rest, but that isn't supported. Is there some way to manually partition OptimizationRun by CaseId through a trigger?To help illustrate the issue, the data in question per case ranges from 15MiB to 1.5GiB with anywhere from 20k to 2M rows.
Update: Current size of the DB is ~1.5TB.
By far, the fastest way to delete a bunch of records is to use the TRUNCATE TABLE statement. This is much faster than the DELETE statement because it does not log any of the row-level delete operations. However, you can only use TRUNCATE TABLE : To delete ALL the records in the table.
What is Oracle Performance Tuning? Performance tuning is the process of administering a database to improve performance. Performance tuning in Oracle databases includes optimizing SQL statements and query execution plans so that the requests can be completed more efficiently.
Performance tuning using the Oracle performance method is driven by identifying and eliminating bottlenecks in the database, and by developing efficient SQL statements. Database tuning is performed in two phases: proactively and reactively.
Deleting data is a hell of a job, for the database. It has to create before images, update indexes, write redo logs and remove the data. This is a slow process. If you can have a window to perform this task, easiest and fastest is to build new tables, containing the wanted data. Drop the old tables and rename the new tables. This requires some setup work, that is obvious but is very well possible to make. One step less drastic is to drop the indexes before the delete takes place. My vote would go for CTAS (Create Table As Select from) and build the new tables. A nice partitioning schema would certainly be helpful, maybe in the next release Oracle can combine interval and reference partitioning. It would be very nice to have.
Disabling logging .... can not be done for deletes but CTAS can use nologging. Make a backup when ready and make sure to transfer the datafiles to the standby database, if you have one.
Just some thoughts:
I assume you have indexes on all foreign keys. ON DELETE CASCADE will hold row level locks until the Case delete is complete, and with no indexes will hold table locks I believe and be super slow of course
Do you have any deferred constraints? This would most likely slow things down for Oracle cascading through the various table deletes
Have you tried to do the deletes separately for all affected tables (instead of relying on on delete cascade)? Not as easy, but you may be surprised.
EDIT:
One more thought. You may consider doing a SOFT delete on Case table, meaning you have a status field that will tell your app if that Case should be considered. This flag could have many different values, but maybe 'A' for active and 'I' for inactive. Assuming you are always using Case as a driving/primary table in joins to other tables, you can avoid the HARD deletes all-together (and occasionally do a cleanup off hours on whatever schedule if you like). Apps would need to be aware of this flag of course, and you'd be tied to joining back to Case table. May or may not fit for your situation...
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With